Introduction


In Differential Equations with Linear Algebra, we endeavor to introduce students 
to two interesting and important areas of mathematics that enjoy powerful 
interconnections and applications. Assuming that students have completed a 
semester of multivariable calculus, the text presents an introduction to critical 
themes and ideas in linear algebra, and then, in its remaining seven chapters, 
investigates differential equations while highlighting the role that linearity plays 
in their study. Throughout the text, we strive to reach the following goals: 


* To motivate the study of linear algebra and differential equations through 
interesting applications in order that students may see how theoretical 
results can answer fundamental questions that arise in physical situations. 


* To demonstrate the fact that linear algebra and differential equations can 
be presented as two parts of a mathematical whole that is coherent and 
interconnected. Indeed, we regularly discuss how the structure of solutions 
to linear differential equations and systems of equations exemplifies
important ideas in linear algebra, and how linear algebra often answers 
key questions regarding differential equations. 


* To present an exposition that is intended to be read and understood by 
students. While certainly every textbook is written with students in mind, 
often the rigor and formality of standard mathematical presentation take
over, and books become difficult to read. We employ an examples-first 
philosophy that uses an intuitive approach as a lead-in to more general, 
theoretical results. 


* To develop in students a deep understanding of what may be their first
exposure to post-calculus mathematics. In particular, linear algebra is a 
fundamental subject that plays a key role in the study of much higher level 
mathematics; through its study, as well as our investigations of differential 
equations, we aim to provide a foundation for further study in 
mathematics for students who are so interested. 


Whether designed for mathematics or engineering majors, many universities 
offer a hybrid course in linear algebra and differential equations, and this text 
is written for precisely such a class. At other institutions, linear algebra and 
differential equations are treated in two separate courses; in settings where linear 
algebra is a prerequisite to the study of differential equations, this text may also 
be used for the differential equations course, with its first chapter on linear 
algebra available as a review of previously studied material. More details on the
ways the book can be used in these courses follow shortly in the section
How to use this text. An overriding theme of the book is that if a differential
equation or system of such equations is linear, then we can usually solve it 
exactly. 


Linear algebra and systems first 


In most other texts that present the subjects of differential equations and linear 
algebra, the presentation begins with first-order differential equations, followed 
by second- and higher order linear differential equations. Following these topics, 
a modest amount of linear algebra is introduced before beginning to consider 
systems of linear differential equations. Here, however, we begin on the very 
first page of the text with an example that shows the natural way that systems 
of linear differential equations arise, and use this example to motivate the 
need to study linear algebra. We then embark on a one-chapter introduction 
to linear algebra that aims not only to introduce such important concepts 
as linear combinations, linear independence, and the eigenvalue problem, 
but also to foreshadow the use of such topics in the study of differential 
equations. 

Following chapter 1, we consider first-order differential equations briefly 
in chapter 2, using the study of linear first-order equations to highlight some 
of the key ideas already encountered in linear algebra. From there, we quickly 
proceed to an in-depth presentation of systems of linear differential equations 
in chapter 3. In that setting, we show how the eigenvalues and eigenvectors of an $n \times n$ matrix $A$
naturally provide the general solution to systems of linear differential equations
in the form $\mathbf{x}' = A\mathbf{x}$. Moreover, we include examples that show how any
single higher order linear differential equation may be converted to a system of 
equations, thus providing further motivation for why we choose to study systems 
first. Through this approach, we again strive to emphasize critical connections 
between linear algebra and differential equations and to demonstrate the most 
important ideas that arise in the study of each. In the remainder of the text, the
role of linear algebra is continually emphasized, even in the study of nonlinear
equations and systems. 


Features of the text 


Instructors and students alike will find several consistent features in the 
presentation. 


* Each chapter begins with one or two motivating problems that present a
natural situation—often a physical application—in which linear algebra 
or differential equations arises. From such problems, we work to develop 
related ideas in subsequent sections that enable us to solve the original 
problem. In discussing the motivating problems, we also endeavor to use 
our intuition to predict the solution(s) we expect to find, and then later 
test our results against these predictions. 


* In almost every section of the text, we use an examples-first approach.
By this we mean that we introduce a certain type of problem that we are
interested in solving, and then consider a relatively simple one that can be 
solved by intuition or ideas studied previously. From the solution of an 
elementary example, we then discuss how this approach can be generalized 
or modified to solve more complex examples, and then ultimately prove 
or state theorems that provide general results that enable the solution of a 
wide range of problems. With this philosophy, we strive to demonstrate 
how the general theory of mathematics comes from experimenting and 
investigating through individual examples followed by looking for overall 
trends. Moreover, we often use this approach to foreshadow upcoming 
ideas: for example, while studying linear algebra, we look ahead to a 
handful of fundamental differential equations. Similarly, early on in
our investigations of the Laplace transform, we regularly attempt to
demonstrate through examples how the transform will be used to solve 
initial-value problems. 


* While there are many formal theoretical results that hold in both linear
algebra and differential equations, we have endeavored to emphasize 
intuition. Specifically, we use the aforementioned examples-first approach 
to solve sample problems and then present evidence as to why the details 
of the solution process for a small number of examples can be generalized 
to an overall structure and theory. This is in contrast to many books that 
first present the overall theory, and then demonstrate the theory at work in 
a sequence of subsequent examples. In addition, we often eschew formal 
proofs, choosing instead to present more heuristic or intuitive arguments 
that offer evidence of the truth of important theorems. 


* Wherever possible, we use visual reasoning to help explain important
ideas. With over 100 graphics included in the text, we have provided
figures that help deepen students’ understanding and offer additional
perspective on essential concepts. By thinking graphically, we often find 
that an appropriate picture sheds further light on the solution to a 
problem and how we should expect it to behave, thus adding to our 
intuition and understanding. 


* With computer algebra systems (CASs), such as Maple and Mathematica,
approaching their twentieth year of existence, these technologies are an 
important part of the landscape of the teaching and learning of 
mathematics. Especially in more sophisticated subjects with 
computationally complicated problems, these tools are now indispensable. 
We have chosen to integrate instructional support for Maple directly
within the text, while offering similar commentary for Mathematica,
MATLAB, and SAGE on our website, www.oup.com/differentialequations/.
For each, students can find directions
for how to effectively use computer algebra systems to generate important 
graphs and execute complicated or tedious calculations. Many sections of 
the text are followed by a short subsection on “Using Maple to ....” Parallel 
sections for the other CASs, numbered similarly, can be found on the 
website. 


* Each chapter ends with a section titled For further study. In this setting,
rather than a full exposition, a sequence of leading questions is presented 
to guide students to discover some key ideas in more advanced problems 
that arise naturally from the material developed to date. These sections 
can be used as a basis for instructor-led in-class discussions or as the 
foundation for student projects or other assignments. Interested students 
can also pursue these topics on their own. 


How to use this text 


There are two courses for which this text is well-suited: a hybrid course in linear 
algebra and differential equations, or a course in differential equations that 
requires linear algebra as a prerequisite. We address each course separately with 
some suggestions for instructors. 


Linear algebra and differential equations 


For a hybrid course in the two subjects, instructors should begin with chapter 1 
on linear algebra. There, in addition to an introduction to many essential 
ideas in the subject, students will encounter a handful of examples on linear 
differential equations that foreshadow part of the role of linear algebra in the 
field of differential equations. The goal of the chapter on linear algebra is to 
introduce important ideas such as linear combinations, linear independence 
and span, matrix algebra, and the eigenvalue problem. At the close of chapter 1
we also introduce abstract vector spaces in anticipation of the structural role
that vector spaces play in solving linear systems of differential equations and 
higher order linear differential equations. Instructors may choose to move on 
from chapter 1 upon completing section 1.10 (the eigenvalue problem), as this 
is the last topic that is absolutely essential for the solution of linear systems of 
differential equations in chapter 3. Discussion of ideas like basis, dimension, 
and vector spaces of functions from the final two sections of chapter 1 can occur 
alongside the development of general solutions to systems of linear differential 
equations or higher order linear differential equations. 

Over the past decade or two, first-order differential equations have become 
a standard topic that is normally discussed in calculus courses. As such, 
chapter 2 can be treated lightly at the instructor’s discretion. In particular, it 
is reasonable to expect that students are familiar with direction fields, separable 
differential equations, Euler’s method, and several fundamental applications, 
such as Newton’s law of cooling and the logistic differential equation. It is
less likely that students will have been exposed to integrating factors as a 
solution technique for linear first-order equations and the solution methods 
for exact equations. In any case, chapter 2 is not one on which to linger. 
Instructors can choose to selectively discuss a small number of sections in class, 
or assign the pages there as a reading assignment or project for independent 
investigation. 

Chapter 3 on systems of linear differential equations is the heart of the 
text. It can be begun immediately following section 1.10 in chapter 1. Here we 
find not only a large number of rich ideas that are important throughout the 
study of differential equations, but also evidence of the essential role that linear 
algebra plays in the solution of these systems. As is noted on several occasions 
in chapter 3, any higher order linear differential equation may be converted to 
a system of first-order equations, and thus an understanding of systems enables 
one to solve these higher order equations as well. Thus, the material in chapter 4 
may be de-emphasized. Instructors may choose to provide a brief overview, in 
class, of how the ideas in solving linear systems translate naturally to the higher 
order case, or may choose to have students investigate these details on their own 
through a sequence of reading and homework assignments or a group project. 
Section 4.5 on beats and resonance is one to discuss in class as these phenomena 
are fascinating and important and the perspective of higher order equations is a 
more natural context in which to consider their solution. 

The Laplace transform is a topic that affords discussion of a variety of 
important ideas: linear transformations, differentiation and integration, direct 
solution of initial-value problems, discontinuous forcing functions, and more. 
In addition, it can be viewed as a gateway to more sophisticated mathematical 
techniques encountered in more advanced courses in mathematics, physics, 
and engineering. Chapter 5 is written with the goal of introducing students 
to the Laplace transform from the perspective of how it can be used to solve 
initial-value problems. This emphasis is present throughout the chapter, and 
culminates in section 5.5.

Finally, a course in both linear algebra and differential equations should
not be considered complete until there has been at least some discussion 
of nonlinearity. Chapter 6 on nonlinear higher order equations and systems 
offers an examination of this concept from several perspectives, all of which 
are related to our previous work with linear differential equations. Direction 
fields, approximation by linear systems, and an introduction to numerical 
approximation with Euler’s method are natural topics with which to round out 
the course. Due to the time required to introduce the subject of linear algebra 
to students, the final two chapters of the text (on numerical methods and series 
solutions) are ones we would normally not expect to be considered in a hybrid 
course. 


Differential equations with a linear algebra prerequisite 


For a differential equations course in which students have already taken linear 
algebra, chapter 1 may be used as a reference for students, or as a source of review 
as needed. The comments for the hybrid course above for chapters 2 through 5 hold for
a straight differential equations class as well, and we would expect instructors 
to use the time not devoted to the study of linear algebra to focus more on 
the material on nonlinearity in chapter 6, numerical methods in chapter 7, and 
series solutions in chapter 8. The first several sections of chapter 7 may be treated 
any time after first-order differential equations have been discussed; only the 
final section in that chapter is devoted to systems and higher order equations 
where the methods naturally generalize work with first-order equations. 


In addition to spending more time on the final three chapters of the text, 
instructors of a differential equations-only course can take advantage of the 
many additional topics for consideration in the For further study sections that 
close each chapter. There is a wide range of subjects from which to choose, both 
theoretical and applied, including discrete dynamical systems, how raindrops 
fall, matrix exponentials, companion matrices, Laplace transforms of periodic 
piecewise continuous forcing functions, and competitive species. 


Appendices 


Finally, the text closes with five appendices. The first three—on integration 
techniques, polynomial zeros, and complex numbers—are intended as a review 
of familiar topics from courses as far back in students’ experience as high school 
algebra. The instructor can refer to these topics as necessary and encourage 
students to read them for review. Appendix D is different in that it aims to 
connect some key ideas in linear algebra and differential equations through a 
more sophisticated viewpoint: linear transformations of vector spaces. Some 
of the material there is appropriate for consideration following chapter 1, 
but it is perhaps more suited to discussion after the Laplace transform has 
been introduced. Finally, appendix E contains answers to nearly all of the 
odd-numbered exercises in the text. 


Acknowledgments 


We are grateful to our institutions for the time and support provided to work 
on this manuscript; to several anonymous reviewers whose comments have 
improved it; to our students for their feedback in classroom-testing of the text; 
and to all students and instructors who choose to use this book. We welcome 
all comments and suggestions for improvement, while taking full responsibility 
for any errors or omissions in the text. 


Matt Boelkins/J. L. Goldberg/Merle Potter 


Differential Equations with Linear Algebra 


1 Essentials of linear algebra


1.1 Motivating problems 


The subjects of differential equations and linear algebra are particularly 
important because each finds a wide range of applications in fundamental 
physical problems. We consider two situations that involve systems of equations 
to motivate our work in this chapter and much of the remainder of the text. 

The pollution of bodies of water is an important issue for humankind. 
Environmental scientists are particularly interested in systems of rivers and 
lakes where they can study the flow of a given pollutant from one body of water
to another. For example, there is great concern regarding the presence of a 
variety of pollutants in the Great Lakes (Lakes Michigan, Superior, Huron, Erie, 
and Ontario), including salt due to snow melt from highways. Due to the large 
number of possible ways for salt to enter and exit such a system, as well as the 
many lakes and rivers involved, this problem is mathematically complicated. But 
we may gain a feel for how one might proceed by considering a simple system of 
two tanks, say A and B, where there are independent inflows and outflows from 
each, as well as two pipes with opposite flows connecting the tanks as pictured 
in figure 1.1. 

We will let $x_1$ denote the amount of salt (in grams) in A at time $t$ (in
minutes). Since water flows into and out of the tank, and each such flow carries
salt, the amount of salt $x_1$ is changing as a function of time. We know from
calculus that $dx_1/dt$ measures the rate of change of salt in the tank with respect
to time, and is measured in grams per minute. In this basic model, we can see
that the rate of change of salt in the tank will be the difference between the net
rate of salt flowing in and the net rate of salt flowing out.

Figure 1.1 Two tanks with inflows, outflows, and connecting pipes.


As a simplifying assumption, we will suppose that the volume of solution in 
each tank remains constant and all inflows and outflows happen at the identical 
rate of 5 liters per minute. We will further assume that the tanks are uniformly 
mixed so that the salt concentration in each is identical throughout the tank at 
a given time $t$.

Let us now suppose that the volume of tank A is 200 liters; as we just noted, 
the pipe flowing into A delivers solution at a rate of 5 liters per minute. Moreover, 
suppose that this entering water is contaminated with 4 g of salt per liter. An 
analysis of the units on these quantities shows that the rate of inflow of salt 
into A is 

\[ \frac{5\ \text{liters}}{\text{min}} \cdot \frac{4\ \text{g}}{\text{liter}} = 20\ \frac{\text{g}}{\text{min}} \tag{1.1.1} \]


There is one other inflow to consider, that being the pipe from B, which we will 
consider momentarily after first examining the behavior of the outflow. 

For the solution exiting the drain from A at a rate of 5 liters/min, observe that
its concentration is unknown and depends on the amount of salt in the tank at
time $t$. In particular, since there are $x_1$ g of salt in the tank at time $t$, and this
is distributed over the volume of 200 liters, we can say (using the simplifying
assumption that the tank’s contents stay uniformly mixed) that the rate of
outflow of salt in each of the exiting pipes is

\[ \frac{5\ \text{liters}}{\text{min}} \cdot \frac{x_1\ \text{g}}{200\ \text{liters}} = \frac{x_1\ \text{g}}{40\ \text{min}} \tag{1.1.2} \]


Since there are two such exit flows, this means that the combined rate of outflow 
of salt from A is twice this amount, or $x_1/20$ g/min.

Finally, there is one last inflow to consider. Note that solution from B is
entering A at a rate of 5 liters per minute. If we assume that B has a (constant)
volume of 400 liters, this flow has a salt concentration of $x_2$ g/400 liters. Thus
the rate of salt entering A from B is

\[ \frac{5\ \text{liters}}{\text{min}} \cdot \frac{x_2\ \text{g}}{400\ \text{liters}} = \frac{x_2\ \text{g}}{80\ \text{min}} \tag{1.1.3} \]

Combining the rates of inflow (1.1.1) and (1.1.3) and outflow (1.1.2), where
inflows are considered positive and outflows negative, leads us to the differential
equation

\[ \frac{dx_1}{dt} = 20 + \frac{x_2}{80} - \frac{x_1}{20} \tag{1.1.4} \]

Since we have two tanks in the system, there is a second differential equation
to consider. Under the assumptions that B has a volume of 400 liters, the pipe
entering B carries a concentration of salt of 7 g/liter, and the net rates of inflow
and outflow match those into A, a similar analysis to the above reveals that

\[ \frac{dx_2}{dt} = 35 + \frac{x_1}{40} - \frac{x_2}{40} \tag{1.1.5} \]

Together, these two DEs form a system of DEs, given by

\[ \begin{aligned} \frac{dx_1}{dt} &= 20 + \frac{x_2}{80} - \frac{x_1}{20} \\ \frac{dx_2}{dt} &= 35 + \frac{x_1}{40} - \frac{x_2}{40} \end{aligned} \tag{1.1.6} \]


Systems of DEs are therefore seen to play a key role in environmental
processes. Indeed, they find application in studying the vibrations of mechanical 
systems, the flow of electricity in circuits, the interactions between predators 
and prey, and much more. We will begin our examination of the mathematics 
involved with systems of differential equations in chapter 3. 

An important question related to the above system of DEs leads us to a 
more familiar mathematical situation, one that is the foundation of much of the 
subject of linear algebra. For the system of tanks above, we might ask, “under 
what circumstances is the amount of salt in the two tanks not changing?” In 
such a situation, neither $x_1$ nor $x_2$ varies, so the rate of change of each is zero,
and therefore

\[ \frac{dx_1}{dt} = \frac{dx_2}{dt} = 0 \]

Substituting these values into the system of DEs, we see that this results in the
system of linear equations

\[ \begin{aligned} 0 &= 20 + \frac{x_2}{80} - \frac{x_1}{20} \\ 0 &= 35 + \frac{x_1}{40} - \frac{x_2}{40} \end{aligned} \tag{1.1.7} \]


Multiplying both sides of the first equation by eighty and the second by forty 
and rearranging terms, we find an equivalent system to be 


\[ \begin{aligned} 4x_1 - x_2 &= 1600 \\ x_1 - x_2 &= -1400 \end{aligned} \]


Geometrically, this system of linear equations represents the set of all points 
that simultaneously lie on each of the two lines given by the respective equations.
The solution of such $2 \times 2$ systems is typically discussed in introductory algebra
classes where students learn how to solve systems like these with the methods
of substitution and elimination. Doing so here leads to the unique solution
$x_1 = 1000$, $x_2 = 2400$; one interpretation of this ordered pair is that the system
of two tanks has an equilibrium state where, if the two tanks ever reach this 
level of salinity, that salinity will then stay constant. With further study of 
linear algebra and DEs, we will be able to show that over time, regardless of 
how much salt is initially in each tank, the amount of salt in A will approach 
1000 g, while that in B will approach 2400 g. We will thus call the equilibrium 
point stable. 
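
As a rough numerical check on this claim, the following Python sketch advances the system (1.1.6) with Euler's method from a few arbitrary starting amounts of salt. The function names, the step size of 0.1 minutes, the 2000-minute horizon, and the starting values are illustrative assumptions rather than part of the text's development; the output suggests, but does not prove, that the equilibrium is stable.

```python
def rates(x1, x2):
    """Right-hand sides of the two-tank system (1.1.6), in grams per minute."""
    dx1 = 20 + x2 / 80 - x1 / 20
    dx2 = 35 + x1 / 40 - x2 / 40
    return dx1, dx2


def simulate(x1, x2, minutes=2000.0, dt=0.1):
    """Advance the system with Euler's method (minutes and dt are assumed values)."""
    steps = int(round(minutes / dt))
    for _ in range(steps):
        dx1, dx2 = rates(x1, x2)
        x1 += dt * dx1
        x2 += dt * dx2
    return x1, x2


if __name__ == "__main__":
    # Both rates should vanish at the equilibrium found by elimination.
    print("rates at (1000, 2400):", rates(1000, 2400))
    # Hypothetical starting amounts of salt (grams) in tanks A and B.
    for start in [(0.0, 0.0), (5000.0, 100.0), (1000.0, 2400.0)]:
        print(start, "->", simulate(*start))
```

The first printed line reports the rates of change at (1000, 2400); each later line shows where a run ends, which can be compared with the equilibrium values above.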

Electrical circuits are another physical situation where systems of linear 
equations naturally arise. Flow of electricity through a collection of wires is 
similar to the flow of water through a sequence of pipes: current measures the 
flow of electrons (charge carriers) past a given point in the circuit. Typically, 
we think about a battery as a source that provides a flow of electricity, wires 
as a collection of paths along which the electricity may flow, and resistors 
as places in the circuit where electricity is converted to some sort of output 
such as heat or light. While we will discuss the principles behind the flow 
of electricity in more detail in section 3.8, for now a basic understanding of 
Kirchhoff’s laws enables us to see an important application of linear systems
of equations. 

In a given loop or branch $j$ of a circuit, current is measured in amperes (A)
and is denoted by the symbol $I_j$. Resistances are measured in ohms ($\Omega$), and the
energy produced by the battery is measured in volts. As shown in figure 1.2, we
use arrows in the circuit to represent the direction of flow of the current; when
this flow is away from the positive side of a battery (the circles in the diagram),
then the voltage is taken to be positive. Otherwise, the voltage is negative.

Figure 1.2 A simple circuit with two loops, two batteries, and four resistors.

Two fundamental laws govern how the currents in various loops of the
circuit behave. One is Kirchhoff’s current law, which is essentially a conservation
law. It states that the sum of all current flowing into a node equals the sum of
the current flowing out. For example, in figure 1.2 at junction a,

\[ I_1 + I_3 = I_2 \tag{1.1.8} \]

Similarly, at junction b, we must have $I_2 = I_1 + I_3$. This equation is identical
to (1.1.8) and adds no new information about the currents.

Ohm’s law governs the flow of electricity through resistors, and states that
the voltage drop across a resistor is proportional to the current. That is, $V = IR$,
where $R$ is a constant that is the amount of resistance, measured in ohms. For
instance, in the circuit given in figure 1.2, the voltage drop through the 3-$\Omega$
resistor on the bottom right is $V = 3I_3$. Kirchhoff’s voltage law states that, in any
closed loop, the sum of the voltage drops must be zero. Since the battery that is
present maintains a constant voltage, it follows that in the bottom loop of the
given circuit,

\[ 4I_3 + 2I_2 + 3I_3 = 5 \tag{1.1.9} \]

Similarly, in the upper loop, we have

\[ 6I_1 + 2I_2 = 10 \tag{1.1.10} \]

Finally, in the outer loop, taking into account the direction of flow of electricity
by regarding opposing flows as having opposing signs, we observe

\[ 6I_1 - 4I_3 - 3I_3 = -5 + 10 \tag{1.1.11} \]


Taking (1.1.8) through (1.1.11), combining like terms, and rearranging each so
that indices are in increasing order, we have the system of linear equations

\[ \begin{aligned} I_1 - I_2 + I_3 &= 0 \\ 2I_2 + 7I_3 &= 5 \\ 6I_1 + 2I_2 &= 10 \\ 6I_1 - 7I_3 &= 5 \end{aligned} \tag{1.1.12} \]

We will call the system (1.1.12) a $4 \times 3$ system to represent the fact that it is a
collection of four linear equations in three unknown variables. Its solution, the
set of all possible values of $(I_1, I_2, I_3)$ that make all four equations simultaneously
true, provides the current in each loop of the circuit.
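
To see that four equations in three unknowns can still have a single solution, the sketch below row-reduces the augmented matrix of (1.1.12) with exact fractions. It assumes the four equations read $I_1 - I_2 + I_3 = 0$, $2I_2 + 7I_3 = 5$, $6I_1 + 2I_2 = 10$, and $6I_1 - 7I_3 = 5$; the helper `rref` is a hypothetical name introduced here, and the reduction method it uses is developed later in the chapter rather than at this point in the text.

```python
from fractions import Fraction


def rref(rows):
    """Return the reduced row echelon form of a matrix given as a list of rows."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(n_cols):
        # Look for a row at or below pivot_row with a nonzero entry in this column.
        pivot = next((r for r in range(pivot_row, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[pivot_row], m[pivot] = m[pivot], m[pivot_row]
        # Scale the pivot row so that the pivot entry becomes 1.
        scale = m[pivot_row][col]
        m[pivot_row] = [v / scale for v in m[pivot_row]]
        # Subtract multiples of the pivot row to clear the rest of the column.
        for r in range(n_rows):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return m


# Augmented matrix [I1, I2, I3 | right-hand side] for the assumed reading of (1.1.12).
circuit = [
    [1, -1, 1, 0],
    [0, 2, 7, 5],
    [6, 2, 0, 10],
    [6, 0, -7, 5],
]

reduced = rref(circuit)
currents = [row[-1] for row in reduced[:3]]  # values of I1, I2, I3 in amperes

if __name__ == "__main__":
    for row in reduced:
        print([str(v) for v in row])
```

Under this reading, the fourth row reduces to all zeros, which reflects the fact that (1.1.11) is the difference of (1.1.10) and (1.1.9) and so carries no independent information.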

In this first chapter, we will develop our understanding of the more general 
situation of systems of linear equations with m linear equations in n unknown 
variables. This problem will lead us to consider important ideas from the theory 
of matrices that play key roles in a variety of applications ranging from computer 
graphics to population dynamics; related ideas will find further applications in 
our subsequent study of systems of differential equations.


1.2 Systems of linear equations


Linear equations are the simplest of all possible equations and are involved 
in many applications of mathematics. In addition, linear equations play a 
fundamental role in the study of differential equations. As such, the notion 
of linearity will be a theme throughout this book. Formally, a linear equation in
variables $x_1, \ldots, x_n$ is one having the form

\[ a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b \tag{1.2.1} \]

where the coefficients $a_1, \ldots, a_n$ and the value $b$ are real or complex numbers.


For example,

\[ 2x_1 + 3x_2 - 5x_3 = 7 \]

is a linear equation, while

\[ x_1^2 + \sin x_2 - x_3 \ln x_1 = 5 \]



is not. Just as the equation $2x_1 + 3x_2 = 7$ describes a line in the $x_1$-$x_2$ plane, the
linear equation $2x_1 + 3x_2 - 5x_3 = 7$ determines a plane in three-dimensional
space.

A system of $m$ linear equations in $n$ unknown variables is a collection of $m$
linear equations in $n$ variables, say $x_1, \ldots, x_n$. We often refer to such a system as
an “$m \times n$ system of equations.” For example,


\[ \begin{aligned} x_1 + 2x_2 + x_3 &= 1 \\ x_1 + x_2 + 2x_3 &= 0 \end{aligned} \tag{1.2.2} \]

is a system of two linear equations in three unknown variables. A solution to the
system is any point $(x_1, x_2, x_3)$ that makes both equations simultaneously true;
the solution set for (1.2.2) is the collection of all such solutions. Geometrically,
each of these two equations describes a plane in three-dimensional space, as
shown in figure 1.3, and hence the solution set consists of all points that lie on
both of the planes. Since the planes are not parallel, we expect this solution set to
form a line in $\mathbb{R}^3$. Note that $\mathbb{R}$ denotes the set of all real numbers; $\mathbb{R}^3$ represents
familiar three-dimensional Euclidean space, the set of all ordered triples with
real entries.

Figure 1.3 The intersection of the planes $x_1 + 2x_2 + x_3 = 1$ and $x_1 + x_2 + 2x_3 = 0$.

The solution set for the system (1.2.2) may be determined using elementary 
algebraic steps. We say that two systems are equivalent if they share the same 
solution set. For example, if we multiply both sides of the first equation by $-1$
and add this to the second equation, we eliminate $x_1$ in the second equation and
get the equivalent system

\[ \begin{aligned} x_1 + 2x_2 + x_3 &= 1 \\ -x_2 + x_3 &= -1 \end{aligned} \]

Next, we multiply both sides of the second equation by $-1$ to get

\[ \begin{aligned} x_1 + 2x_2 + x_3 &= 1 \\ x_2 - x_3 &= 1 \end{aligned} \]


Finally, if we multiply the second equation by $-2$ and add it to the first equation,
it follows that

\[ \begin{aligned} x_1 + 3x_3 &= -1 \\ x_2 - x_3 &= 1 \end{aligned} \tag{1.2.3} \]

This shows that any solution $(x_1, x_2, x_3)$ of the original system must satisfy the
(simpler) equivalent system of equations $x_1 = -1 - 3x_3$ and $x_2 = 1 + x_3$. Said
differently, any point in $\mathbb{R}^3$ of the form $(-1 - 3x_3, 1 + x_3, x_3)$, where $x_3 \in \mathbb{R}$ (here
the symbol ‘$\in$’ means is an element of), is a solution to the system. Replacing
$x_3$ by the parameter $t$, we recognize that the solution to the system is the line
parameterized by

\[ (-1 - 3t,\ 1 + t,\ t), \quad t \in \mathbb{R} \tag{1.2.4} \]


which is the intersection of the two planes with which we began, as seen in 
figure 1.3. Note that this shows there are infinitely many solutions to the given 
system of equations; a particular example of such a solution may be found by 
selecting any value of t (i.e., any point on the line). We can also check that the 
resulting point makes both of the original equations true. 
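
The check just described is easy to automate. The following minimal Python sketch substitutes points of the line (1.2.4) into both equations of (1.2.2); the function names and the sample values of $t$ are our own illustrative choices.

```python
def point_on_line(t):
    """Point (x1, x2, x3) on the line (1.2.4) for parameter value t."""
    return (-1 - 3 * t, 1 + t, t)


def residuals(x1, x2, x3):
    """Left side minus right side for each equation of system (1.2.2)."""
    return (x1 + 2 * x2 + x3 - 1, x1 + x2 + 2 * x3 - 0)


if __name__ == "__main__":
    # Arbitrary sample parameter values; any real t could be used.
    for t in [-2, 0, 1, 10]:
        p = point_on_line(t)
        print("t =", t, "point =", p, "residuals =", residuals(*p))
```

A pair of zero residuals means the point satisfies both equations; a finite list of sample values illustrates the algebra but is of course no substitute for it.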

It is not hard to see in the $2 \times 2$ case that any linear system has either no
solution (the lines are parallel), a unique solution (the lines intersect once), or 
infinitely many solutions (the two equations represent the same line). These 
three options (no solution, exactly one solution, or infinitely many) turn out to 
be the only possible cases for any $m \times n$ system of linear equations. A system
with at least one solution is said to be consistent, while a system with no solution 
is called inconsistent. 
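
For the $2 \times 2$ case, the three possibilities can be told apart by simple arithmetic on the coefficients. The sketch below is one way to do so, assuming the system is written as $ax + by = e$, $cx + dy = f$; the function name `classify_2x2`, the argument order, and the use of the quantity $ad - bc$ (which the text has not yet introduced) are assumptions made for illustration.

```python
def classify_2x2(a, b, e, c, d, f):
    """Classify the system a*x + b*y = e, c*x + d*y = f by its number of solutions."""
    det = a * d - b * c
    if det != 0:
        # The two lines have different directions and cross exactly once.
        return "unique"
    if a == b == c == d == 0:
        # Degenerate case: neither equation describes a line at all.
        return "infinitely many" if e == f == 0 else "none"
    # The left sides are proportional; the system is consistent only if
    # the right sides are proportional in the same way.
    consistent = (a * f - c * e == 0) and (b * f - d * e == 0)
    return "infinitely many" if consistent else "none"


if __name__ == "__main__":
    print(classify_2x2(4, -1, 1600, 1, -1, -1400))  # equilibrium system from section 1.1
    print(classify_2x2(1, 1, 1, 1, 1, 2))           # parallel lines
    print(classify_2x2(1, 1, 1, 2, 2, 2))           # the same line written twice
```

Exact integer or fraction inputs are assumed here; with floating-point coefficients, the comparisons with zero would need a tolerance.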

In our work above from (1.2.2) to (1.2.3) in reducing the given system of 
equations to a simpler equivalent one, it is evident that the coefficients of the 
system played the key role, while the variables $x_1$, $x_2$, and $x_3$ (and the equals sign)
were essentially placeholders. It proves expedient to therefore change notation 
and collect all of the coefficients into a rectangular array (called a matrix) and 
eliminate the redundancy of repeatedly writing the variables. Let us reconsider 
our above work in this light, where we will now refer to rows in the coefficient
matrix rather than equations in the original system. When we create a right-most 
column consisting of the constants from the right-hand side of each equation, 
we often say we have an augmented matrix. 

From the ‘simplest’ version of the system at (1.2.3), the corresponding
augmented matrix is

$$
\begin{bmatrix} 1 & 0 & 3 & -1 \\ 0 & 1 & -1 & 1 \end{bmatrix}
$$


The 0’s represent variables that have been eliminated in each equation. From 
this, we see that our goal in working with a matrix that represents a system 
of equations is essentially to introduce as many zeros as possible through 
operations that do not change the solution set of the system. We now repeat the 
exact same steps we took with the system above, but translate our operations to 
be on the matrix, rather than the equations themselves. 

We begin with the augmented matrix 


$$
\begin{bmatrix} 1 & 2 & 1 & 1 \\ 1 & 1 & 2 & 0 \end{bmatrix}
$$


To introduce a zero in the bottom left corner, we add $-1$ times the first row to
the second row, to yield a new row 2 and the updated matrix 


$$
\begin{bmatrix} 1 & 2 & 1 & 1 \\ 0 & -1 & 1 & -1 \end{bmatrix}
$$


The ‘0’ in the second entry of the first column shows that we have eliminated
the presence of the $x_1$ variable in the second equation. Next, we can multiply
row 2 by $-1$ to obtain an updated row 2 and the augmented matrix

$$
\begin{bmatrix} 1 & 2 & 1 & 1 \\ 0 & 1 & -1 & 1 \end{bmatrix}
$$

Finally, if we multiply row 2 by $-2$ and add this to row 1, we find a new row 1
and the matrix

$$
\begin{bmatrix} 1 & 0 & 3 & -1 \\ 0 & 1 & -1 & 1 \end{bmatrix}
$$


At this point, we have introduced as many zeros as possible¹, and have arrived
at our goal of the simplest possible equivalent system. We can reinterpret the
matrix as a system of equations: the first row implies that $x_1 + 3x_3 = -1$, while
the second row implies $x_2 - x_3 = 1$. This leads us to find, as we did above,
that any solution $(x_1, x_2, x_3)$ of the original system must be of the form
$(-1 - 3x_3,\ 1 + x_3,\ x_3)$, where $x_3 \in \mathbb{R}$.


1 Any additional row operations to introduce zeros in the third or fourth columns will replace the
zeros in columns 1 or 2 with nonzero entries.

We will commonly need to refer to the number of rows and columns in a
matrix. For example, the matrix 


$$
\begin{bmatrix} 1 & 0 & 3 & -1 \\ 0 & 1 & -1 & 4 \end{bmatrix}
$$

has two rows and four columns; therefore, we say this is a $2 \times 4$ matrix. In
general, an $m \times n$ matrix has $m$ rows and $n$ columns. Observe that if we
have a $2 \times 3$ system of equations, its corresponding augmented matrix will be
$2 \times 4$.

The above example demonstrates the general fact that there are basic 
operations we can perform on an augmented matrix that, at each stage, result 
in the matrix representing an equivalent system of equations; that is, these 
operations do not change the solution to the system, but rather make the solution 
more easily obtained. In particular, we may 


1. Replace one row by the sum of itself and a multiple of another row; 
2. Interchange any two rows; or 


3. Scale a row by multiplying every entry in a given row by a fixed nonzero 
constant. 


These three types of operations are typically called elementary row operations. 
Two matrices are row equivalent if there is a sequence of elementary row 
operations that transform one matrix into the other. When matrices are used 
to represent systems of linear equations, as was done above, it is always the case 
that row-equivalent matrices correspond to equivalent systems. 
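The three elementary row operations listed above can be sketched in a few lines of Python. This is only an illustration, not the Maple approach taken later in the section: the function names are hypothetical, rows are indexed from 0, and exact fractions are used so that no rounding occurs. The usage lines replay the three steps applied earlier to the augmented matrix of the opening system.

```python
from fractions import Fraction


def replace_row(matrix, target, source, multiple):
    """Operation 1: add `multiple` times row `source` to row `target`."""
    result = [row[:] for row in matrix]
    result[target] = [t + multiple * s
                      for t, s in zip(matrix[target], matrix[source])]
    return result


def swap_rows(matrix, i, j):
    """Operation 2: interchange rows i and j."""
    result = [row[:] for row in matrix]
    result[i], result[j] = result[j], result[i]
    return result


def scale_row(matrix, i, factor):
    """Operation 3: multiply every entry of row i by a nonzero constant."""
    if factor == 0:
        raise ValueError("scaling factor must be nonzero")
    result = [row[:] for row in matrix]
    result[i] = [factor * entry for entry in matrix[i]]
    return result


# Augmented matrix of the system reduced earlier in this section.
start = [[Fraction(1), Fraction(2), Fraction(1), Fraction(1)],
         [Fraction(1), Fraction(1), Fraction(2), Fraction(0)]]
step1 = replace_row(start, 1, 0, -1)   # add -1 times row 1 to row 2
step2 = scale_row(step1, 1, -1)        # multiply row 2 by -1
step3 = replace_row(step2, 0, 1, -2)   # add -2 times row 2 to row 1
print(step3)
```

Each function returns a new matrix rather than modifying its argument, which makes it easy to inspect every intermediate stage.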

We desire to use elementary row operations systematically to produce row 
equivalent matrices from which we may easily interpret the solution to a system 
of equations. For example, the solution to the system represented by 


$$
\begin{bmatrix} 1 & 0 & 0 & -5 \\ 0 & 1 & 0 & 6 \\ 0 & 0 & 1 & -3 \end{bmatrix}
\tag{1.2.5}
$$

is easy to obtain (in particular, $x_1 = -5$, $x_2 = 6$, $x_3 = -3$), while the solution for


is not, even though the two matrices are equivalent. Therefore, we desire each 
variable in the system to be represented in its corresponding augmented matrix 
as infrequently as possible. Essentially our goal is to get as many columns of the 
matrix as possible to have one entry that is 1, while all the rest of the entries in 
that column are 0. 
A matrix is said to be in reduced row echelon form (RREF) if and only if
the following characteristics are satisfied:


* All nonzero rows are above any rows with all zeros


* The first nonzero entry (or leading entry) in a given row is 1 and is in a
column to the right of the first nonzero entry in any row above it


* Every other entry in a column with a leading 1 is 0


For example, the matrix in (1.2.5) is in RREF, while the matrix 


$$
\begin{bmatrix} 1 & -2 & 4 & -5 \\ 0 & 2 & 7 & 6 \\ 0 & 0 & -3 & -3 \end{bmatrix}
$$


is not, since two of the rows lack leading 1’s, and columns 2 and 3 lack zeros in 
the entries above the lowest nonzero locations. 
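The three defining characteristics of RREF translate directly into a test that can be run on any matrix. The Python sketch below is one possible way to write such a test; the names `leading_index` and `is_rref` are hypothetical, and a matrix is assumed to be given as a list of rows.

```python
def leading_index(row):
    """Index of the first nonzero entry in a row, or None for a zero row."""
    for index, entry in enumerate(row):
        if entry != 0:
            return index
    return None


def is_rref(matrix):
    """Check the three RREF characteristics row by row."""
    previous_lead = -1
    seen_zero_row = False
    for r, row in enumerate(matrix):
        lead = leading_index(row)
        if lead is None:
            seen_zero_row = True
            continue
        if seen_zero_row:
            return False  # a nonzero row sits below a row of all zeros
        if row[lead] != 1 or lead <= previous_lead:
            return False  # leading entry is not 1, or not to the right
        if any(other[lead] != 0 for k, other in enumerate(matrix) if k != r):
            return False  # pivot column has another nonzero entry
        previous_lead = lead
    return True


in_rref = [[1, 0, 0, -5], [0, 1, 0, 6], [0, 0, 1, -3]]       # matrix (1.2.5)
not_in_rref = [[1, -2, 4, -5], [0, 2, 7, 6], [0, 0, -3, -3]]  # example above
print(is_rref(in_rref), is_rref(not_in_rref))
```

The second matrix fails at its second row, whose leading entry is 2 rather than 1.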

Each leading 1 in RREF is said to be in a pivot position, the column in which 
the 1 lies is termed a pivot column, and the leading 1 itself is called a pivot. 
Rows with all zeros do not contain a pivot position. The process by which row 
operations are applied to a matrix to convert it to RREF is usually called Gauss–Jordan
elimination. We will also say that we “row-reduced” a given matrix.
While this process can be described in a somewhat cumbersome algorithm, it is 
best demonstrated with a few examples. By working through the details of the 
following problems (in particular by deciding which elementary row operations 
were performed at each stage), the reader will not only learn the basics of row 
reduction, but also will see and understand the key possibilities for the solution 
set of a system of linear equations. 
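For readers who would like to see the algorithm written out, here is a minimal Python sketch of Gauss–Jordan elimination. It is one reasonable way to organize the row operations (take the first available nonzero entry in each column as the pivot), not necessarily the order of steps used in the examples that follow, and the function name `rref` is hypothetical. Exact fractions keep the arithmetic free of rounding.

```python
from fractions import Fraction


def rref(matrix):
    """Return the reduced row echelon form of a matrix given as a list of rows."""
    rows = [[Fraction(entry) for entry in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == n_rows:
            break
        # Look for a nonzero entry in this column at or below pivot_row.
        candidates = [r for r in range(pivot_row, n_rows) if rows[r][col] != 0]
        if not candidates:
            continue  # no pivot in this column
        r = candidates[0]
        rows[pivot_row], rows[r] = rows[r], rows[pivot_row]      # interchange
        pivot = rows[pivot_row][col]
        rows[pivot_row] = [e / pivot for e in rows[pivot_row]]   # scale to get a leading 1
        for other in range(n_rows):                              # row replacement
            if other != pivot_row and rows[other][col] != 0:
                factor = rows[other][col]
                rows[other] = [a - factor * b
                               for a, b in zip(rows[other], rows[pivot_row])]
        pivot_row += 1
    return rows


# The augmented matrix row-reduced by hand earlier in this section.
reduced = rref([[1, 2, 1, 1], [1, 1, 2, 0]])
print(reduced)
```

Because the RREF of a matrix is unique, any valid ordering of row operations must arrive at the same final matrix as the hand computation.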


Example 1.2.1 Solve the system of equations 


$$
\begin{aligned}
3x_1 + 2x_2 - x_3 &= 8 \\
x_1 - 4x_2 + 2x_3 &= -9 \\
-2x_1 + x_2 + x_3 &= -1
\end{aligned}
\tag{1.2.6}
$$


Solution. We begin with the corresponding augmented matrix 


$$
\begin{bmatrix} 3 & 2 & -1 & 8 \\ 1 & -4 & 2 & -9 \\ -2 & 1 & 1 & -1 \end{bmatrix}
$$


and then perform a sequence of row operations. The arrows below denote the 
fact that one or more row operations have been performed to produce a row 
equivalent matrix. We find that 


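$$
\begin{bmatrix} 3 & 2 & -1 & 8 \\ 1 & -4 & 2 & -9 \\ -2 & 1 & 1 & -1 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & -4 & 2 & -9 \\ 3 & 2 & -1 & 8 \\ -2 & 1 & 1 & -1 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & -4 & 2 & -9 \\ 0 & 14 & -7 & 35 \\ 0 & -7 & 5 & -19 \end{bmatrix}
\rightarrow
$$

$$
\begin{bmatrix} 1 & -4 & 2 & -9 \\ 0 & 1 & -\frac{1}{2} & \frac{5}{2} \\ 0 & -7 & 5 & -19 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & -\frac{1}{2} & \frac{5}{2} \\ 0 & 0 & \frac{3}{2} & -\frac{3}{2} \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & -\frac{1}{2} & \frac{5}{2} \\ 0 & 0 & 1 & -1 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 2 \\ 0 & 0 & 1 & -1 \end{bmatrix}
$$

This shows us that the original $3 \times 3$ system has a unique solution, and that this
solution is the point $(1, 2, -1)$. Geometrically, this demonstrates that the three
planes with equations given by the system (1.2.6) meet in a single point, as we
can see in figure 1.4.

Figure 1.4 The intersection of the three planes given by the linear system (1.2.6).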


Example 1.2.2 Solve the system of equations 


$$
\begin{aligned}
x_1 + 2x_2 - x_3 &= 1 \\
x_1 + x_2 &= 2 \\
3x_1 + x_2 + 2x_3 &= 8
\end{aligned}
\tag{1.2.7}
$$


Solution. We consider the corresponding augmented matrix 


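$$
\begin{bmatrix} 1 & 2 & -1 & 1 \\ 1 & 1 & 0 & 2 \\ 3 & 1 & 2 & 8 \end{bmatrix}
$$

and again perform a sequence of row operations:

$$
\begin{bmatrix} 1 & 2 & -1 & 1 \\ 1 & 1 & 0 & 2 \\ 3 & 1 & 2 & 8 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 2 & -1 & 1 \\ 0 & -1 & 1 & 1 \\ 0 & -5 & 5 & 5 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 2 & -1 & 1 \\ 0 & 1 & -1 & -1 \\ 0 & -5 & 5 & 5 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 0 & 1 & 3 \\ 0 & 1 & -1 & -1 \\ 0 & 0 & 0 & 0 \end{bmatrix}
$$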


In this case, we see that one row of the matrix has essentially vanished. This 
shows that one of the equations in the original system was redundant, and 
did not contribute any restrictions on the system. Moreover, as the matrix is
now in RREF, we can see that the simplest equivalent system is given by the
two equations $x_1 + x_3 = 3$ and $x_2 - x_3 = -1$. In other words, $x_1 = 3 - x_3$ and
$x_2 = -1 + x_3$. Since the variable $x_3$ has no restrictions on it, we call $x_3$ a free
variable. This implies that the system under consideration has infinitely many
solutions, each having the form

$$
(3 - t,\ -1 + t,\ t), \quad \text{where } t \in \mathbb{R} \tag{1.2.8}
$$


In the next section, we will begin to emphasize the role that vectors play in 
systems of linear equations. For example, the ordered triple $(3 - t,\ -1 + t,\ t)$
in (1.2.8) may be viewed as a vector in $\mathbb{R}^3$. In addition, the representation (1.2.8)
of the set of all solutions involving the parameter $t$ is often called the parametric
vector form of the solution. As we saw in the very first system of equations 
discussed in this section, example 1.2.2 shows that the three planes given in the 
system (1.2.7) meet in a line. 


Example 1.2.3 Solve the system of equations 


$$
\begin{aligned}
x_1 + 2x_2 - x_3 &= 1 \\
x_1 + x_2 &= 2 \\
3x_1 + x_2 + 2x_3 &= 7
\end{aligned}
$$


Solution. Observe that the only difference between this example and the 
previous one is that the “8” in the third equation has been replaced with “7.” 
We proceed with identical row operations to those above and find that 


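$$
\begin{bmatrix} 1 & 2 & -1 & 1 \\ 1 & 1 & 0 & 2 \\ 3 & 1 & 2 & 7 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 2 & -1 & 1 \\ 0 & -1 & 1 & 1 \\ 0 & -5 & 5 & 4 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 2 & -1 & 1 \\ 0 & 1 & -1 & -1 \\ 0 & -5 & 5 & 4 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 0 & 1 & 3 \\ 0 & 1 & -1 & -1 \\ 0 & 0 & 0 & -1 \end{bmatrix}
$$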


In this case, the final row of the reduced matrix corresponds to the equation 
$0x_1 + 0x_2 + 0x_3 = -1$. Since there are no points $(x_1, x_2, x_3)$ that make this
equation true, it follows that there can be no points which simultaneously satisfy 
all three equations in the system. Said differently, the three planes given in the 
original system of equations do not meet at a single point, nor do they meet in 
a line. Therefore, the system has no solution; recall that we call such a system 
inconsistent. 


Note that the only difference between example 1.2.2 and example 1.2.3 is one 
constant on the right-hand side in the equation of one of the planes. This changed
the result dramatically, from the case where the system had infinitely many 
solutions to one where no solutions were present. This is evident geometrically 
if we think about a situation where three planes meet in a line, and then we alter 
the equation of one of the planes to shift it to a new plane parallel to its original 
location: the three planes will no longer have any points in common. 
Algebraically, we can see what is so special about the one constant we changed
(8 to 7) if we replace this value with an arbitrary constant, say $k$, and perform
row operations:

$$
\begin{bmatrix} 1 & 2 & -1 & 1 \\ 1 & 1 & 0 & 2 \\ 3 & 1 & 2 & k \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 2 & -1 & 1 \\ 0 & 1 & -1 & -1 \\ 0 & -5 & 5 & k - 3 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 0 & 1 & 3 \\ 0 & 1 & -1 & -1 \\ 0 & 0 & 0 & k - 8 \end{bmatrix}
$$


This shows that for any value of k other than 8, the resulting system of linear 
equations will be inconsistent, therefore having no solutions. In the case that 
k = 8, we see that a free variable arises and then the system has infinitely many 
solutions. 

Overall, the question of consistency is an important one for any linear 
system of equations. In asking “is this system consistent?” we investigate whether 
or not the system has at least one solution. Moreover, we are now in a position 
to understand how RREF determines the answer to this question. We note from 
considering the RREF of a matrix that there are two overall cases: either the 
system contains an equation of the form $0x_1 + \cdots + 0x_n = b$, where $b$ is nonzero,
or it has no such equation. In the former case, the system is inconsistent and 
has no solution. In the latter case, it will either be that every variable is uniquely 
determined, or that there are one or more free variables present, in which case 
there are infinitely many solutions to the system. This leads us to state the 
following theorem. 


Theorem 1.2.1 For any linear system of equations, there are only three possible 
cases for the solution set: there are no solutions, there is a unique solution, or 
there are infinitely many solutions. 
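The case analysis that leads to this theorem can be sketched as a short Python function. The function name and the three return strings are hypothetical choices for illustration, and the input is assumed to be an augmented matrix that has already been row-reduced, given as a list of rows whose last entry is the constant term.

```python
def classify_solution_set(reduced_augmented):
    """Classify a system from its row-reduced augmented matrix."""
    n_vars = len(reduced_augmented[0]) - 1
    pivot_columns = []
    for row in reduced_augmented:
        lead = next((j for j, entry in enumerate(row) if entry != 0), None)
        if lead is None:
            continue  # a row of all zeros imposes no restriction
        if lead == n_vars:
            # Row reads 0x_1 + ... + 0x_n = b with b nonzero.
            return "no solutions"
        pivot_columns.append(lead)
    if len(pivot_columns) == n_vars:
        return "unique solution"  # every variable is determined
    return "infinitely many solutions"  # at least one free variable


# Final matrices from examples 1.2.1, 1.2.2, and 1.2.3, respectively.
example_1 = [[1, 0, 0, 1], [0, 1, 0, 2], [0, 0, 1, -1]]
example_2 = [[1, 0, 1, 3], [0, 1, -1, -1], [0, 0, 0, 0]]
example_3 = [[1, 0, 1, 3], [0, 1, -1, -1], [0, 0, 0, -1]]
for matrix in (example_1, example_2, example_3):
    print(classify_solution_set(matrix))
```

The three examples of this section land in the three different cases, one each.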


This central fact regarding linear systems will play a key role in our studies. 


1.2.1 Row-reduction using Maple 


Obviously one of the problems with the process of row reducing a matrix 
is the potential for human arithmetic errors. Soon we will learn how to use 
computer software to execute all of these computations quickly; first, though, 
we can deepen our understanding of how the process works, and simultaneously 
eliminate arithmetic mistakes, by using a computer algebra system in a step-by- 
step fashion. Our software of choice is Maple. For now, we only assume that the 
user is familiar with Maple’s interface, and will introduce relevant commands 
with examples as we go. 

We will use the LinearAlgebra package in Maple, which is loaded using 
the command 


> with(LinearAlgebra):


(The symbol ‘>’ is called a Maple prompt; the program makes this available to 
the user automatically, and it should not be entered by the user.) To demonstrate 
various commands, we will revisit the system from example 1.2.1. The reader
should explore this code actively by entering and experimenting on his or her
own. Recall that we were interested in row-reducing the augmented matrix

$$
\begin{bmatrix} 3 & 2 & -1 & 8 \\ 1 & -4 & 2 & -9 \\ -2 & 1 & 1 & -1 \end{bmatrix}
$$


We enter the augmented matrix, say A, column-wise in Maple with the
command


> A := <<3,1,-2>|<2,-4,1>|<-1,2,1>|<8,-9,-1>>;

We first want to swap rows 1 and 2; this is accomplished by entering

> A1 := RowOperation(A, [1,2]);

Note that this stores the result of this row operation in the matrix A1, which is
convenient for use in the next step. After executing the most recent command,
the following matrix will appear on the screen:

$$
\mathit{A1} := \begin{bmatrix} 1 & -4 & 2 & -9 \\ 3 & 2 & -1 & 8 \\ -2 & 1 & 1 & -1 \end{bmatrix}
$$

To perform row-replacement, our next step is to add $(-3) \cdot R_1$ to $R_2$ (where
rows 1 and 2 are denoted $R_1$ and $R_2$) to generate a new second row; similarly,
we will add $2 \cdot R_1$ to $R_3$ for an updated row 3. The commands that accomplish
these steps are


> A2 := RowOperation(A1,[2,1],-3);
> A3 := RowOperation(A2,[3,1],2);


and lead to the following output:

$$
\mathit{A3} := \begin{bmatrix} 1 & -4 & 2 & -9 \\ 0 & 14 & -7 & 35 \\ 0 & -7 & 5 & -19 \end{bmatrix}
$$

Next, we will scale row 2 by a factor of 1/14 using the command


> A4 := RowOperation(A3,2,1/14);


to find that

$$
\mathit{A4} := \begin{bmatrix} 1 & -4 & 2 & -9 \\ 0 & 1 & -\frac{1}{2} & \frac{5}{2} \\ 0 & -7 & 5 & -19 \end{bmatrix}
$$

The remainder of the computations in this example involve slightly modified
versions of the three forms of the RowOperation command demonstrated
above, and are left as an exercise for the reader. Recall that the unique solution
to the original system is $(1, 2, -1)$.

Maple is certainly capable of performing all of these steps at once. After 
completing each step-by-step command above in the row-reduction process, 
the result can be checked by executing the command 


> ReducedRowEchelonForm(A);


The corresponding output should be

$$
\begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 2 \\ 0 & 0 & 1 & -1 \end{bmatrix}
$$

which clearly reveals the unique solution to the system, $(1, 2, -1)$.


Exercises 1.2 In exercises 1–4, solve each system of equations or explain why
no solution exists.


1. $x_1 + 2x_2 = 1$
   $x_1 + x_2 = 0$

2. $x_1 + 2x_2 = 1$
   $-2x_1 - 4x_2 = -2$

3. $x_1 + 2x_2 = 1$
   $-2x_1 - 4x_2 = -3$

4. $4x_1 - 3x_2 = 5$
   $-x_1 + 4x_2 = 2$

In exercises 5–9, for each linear system represented by a given augmented matrix
in RREF, decide whether or not the system is consistent. If the system is
consistent, determine its solution set. For systems with infinitely many solutions,
express the solution in parametric vector form.


5. $\begin{bmatrix} 1 & 0 & 0 & 4 \\ 0 & 1 & 0 & -2 \\ 0 & 0 & 1 & 3 \end{bmatrix}$

6. $\begin{bmatrix} 1 & 0 & 0 & 4 \\ 0 & 1 & 1 & -2 \\ 0 & 0 & 0 & 3 \end{bmatrix}$

7. $\begin{bmatrix} 1 & 0 & 2 & -3 \\ 0 & 1 & 1 & -2 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$

8. $\begin{bmatrix} 1 & 0 & 0 & -3 & 5 \\ 0 & 0 & 1 & -2 & 4 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$

9.f1 -2 040 -1 
0130 | 
0 0001 - 
0 0000 a 


In exercises 10-14, the given augmented matrix represents a system for which 
some row operations have been performed to partially row-reduce the matrix. 
By deciding which operations must next be executed, finish row-reducing each 
matrix. Finally, interpret your results to state the solution set to the system. 


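10. $\begin{bmatrix} 1 & 3 & 2 & 5 \\ 0 & 1 & -4 & -1 \\ 0 & 0 & 1 & 7 \end{bmatrix}$

11. $\begin{bmatrix} 1 & 0 & 0 & 4 \\ 0 & 0 & 0 & 3 \\ 0 & 1 & 1 & -2 \end{bmatrix}$

12. $\begin{bmatrix} 1 & 0 & 2 & -3 \\ 0 & 1 & 1 & -2 \\ 0 & 3 & 3 & -6 \\ 0 & 2 & 2 & -1 \end{bmatrix}$

13. $\begin{bmatrix} 1 & 0 & 5 & -1 & 6 \\ 0 & 0 & 2 & -8 & 2 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$

14. $\begin{bmatrix} 1 & -3 & 0 & 5 & 0 & -3 \\ 0 & 0 & 1 & 3 & 0 & 4 \\ 0 & 0 & 0 & 1 & 2 & -9 \\ 0 & 0 & 0 & 0 & 1 & 4 \end{bmatrix}$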


Determine all value(s) of h that make each augmented matrix in exercises 15-18 
correspond to a consistent linear system. For such h, describe the solution set 
to the system. 
Use a computer algebra system to perform step-by-step row operations to solve
each of the following linear systems in exercises 19-23. If the system is consistent,
determine its solution set. For systems with infinitely many solutions, express 
the solution in parametric vector form. 


19. $x_1 - x_2 + x_3 = 5$
    $2x_1 - 4x_2 + 3x_3 = 0$
    $x_1 - 6x_2 + 2x_3 = 3$


20. $4x_1 + 2x_2 - x_3 = -2$
    $x_1 - x_2 + x_3 = 6$
    $-3x_1 + x_2 - 4x_3 = -20$


21. $4x_1 + 2x_2 - x_3 = -2$
    $x_1 - x_2 + x_3 = 6$
    $-2x_1 - 4x_2 + 3x_3 = 14$


22. $4x_1 + 2x_2 - x_3 = -2$
    $x_1 - x_2 + x_3 = 6$
    $-2x_1 - 4x_2 + 3x_3 = 13$


23. $2x_2 + 3x_3 - 4x_4 = 1$
    $2x_3 + 3x_4 = 4$
    $2x_1 + 2x_2 - 5x_3 + 2x_4 = 4$
    $2x_1 - 6x_3 + 9x_4 = 7$


In exercises 24-27, determine whether or not the given three lines or planes 
meet in a single point. Justify your answer using appropriate row operations. 


24. $x_1 + x_2 = 5$, $2x_1 - 3x_2 = -5$, $-4x_1 + 2x_2 = -2$
25. $x_1 + x_2 = 5$, $2x_1 - 3x_2 = -5$, $-4x_1 + 2x_2 = -3$
26. $x_1 + x_2 + x_3 = 5$, $2x_1 - 3x_2 + x_3 = 1$, $-4x_1 + 2x_2 + 5x_3 = 4$
27. $x_1 + x_2 + x_3 = 5$, $2x_1 - 3x_2 + x_3 = 3$, $-4x_1 + 2x_2 + 5x_3 = 4$


28. Consider a linear system whose corresponding augmented matrix has all 
zeros in its final column. Is it ever possible for such a system to be 
inconsistent? Why or why not? 


29. Is it possible for a $2 \times 3$ linear system to be inconsistent? Explain.


30. If a $3 \times 4$ linear system has three pivot columns in its corresponding
augmented matrix, can you determine whether or not the system must be 
consistent? Explain. 


31. A system of linear equations has a unique solution. What can be 
determined about the relationship between the number of pivot columns 
in the augmented matrix and the number of variables in the system? 
32. Decide whether each of the following sentences is true or false. In every
case, write one sentence to support your answer.

(a) Two lines must either intersect or be parallel.

(b) A system of three linear equations in three unknown variables can
have exactly three solutions.

(c) If the RREF of a matrix has a row of all zeros, then the corresponding
system must have a free variable present.

(d) If a system has a free variable present, then the system has infinitely
many solutions.

(e) A solution to a $4 \times 3$ linear system is a list of four numbers
$(x_1, x_2, x_3, x_4)$ that simultaneously makes every equation in the system
true.

(f) A matrix with three columns and four rows is $3 \times 4$.

(g) A consistent system is one with exactly one solution.


33. Suppose that we would like to find a quadratic function
$p(t) = a_2 t^2 + a_1 t + a_0$ that passes through the three points (1, 4), (2, 7),
and (3, 6). How does this problem lead to a system of linear equations?
Find the function $p(t)$. (Hint: $p(1) = 4$ implies that $4 = a_2 \cdot 1^2 + a_1 \cdot 1 + a_0$.)


34. Find a quadratic function $p(t) = a_2 t^2 + a_1 t + a_0$ that passes through the
three points $(-1, 1)$, $(2, -1)$, and $(5, 4)$. How does this problem involve a
system of linear equations?


35. For the circuit shown at the left in figure 1.5, set up and solve a system of
linear equations whose solution is the respective currents $I_1$, $I_2$, and $I_3$.


36. For the circuit shown at the right in figure 1.5, set up and solve a system of
linear equations whose solution is the respective currents $I_1$, $I_2$, and $I_3$.

Figure 1.5 Circuits for use in exercises 35 and 36.
1.3 Linear combinations


An important theme in mathematics that is especially present in linear algebra 
is the value of considering the same idea from a variety of different perspectives. 
Often, we can make statements that on the surface may seem unrelated, when 
in fact they ultimately mean the same thing, and one of the statements is most 
advantageous for solving a particular problem. Throughout our study of linear 
algebra, we will see that the subject offers a wide variety of perspectives and 
terminology for addressing the central concept: systems of linear equations. In 
this section, we take another look at the concept of consistency, but do so in a
different, geometric light. 


Example 1.3.1 Consider the system of equations 


$$
\begin{aligned}
x_1 - x_2 &= 1 \\
x_1 + x_2 &= 3 \\
x_1 + 2x_2 &= 4
\end{aligned}
\tag{1.3.1}
$$

Rewrite the system in vector form and explore how two vectors are being
combined to form a third, particularly in terms of the geometry of $\mathbb{R}^3$. Then
solve the system.


Solution. In multivariable calculus, we learn to think of vectors in $\mathbb{R}^3$ very
much like we think of points. For example, given the point $(a, b, c)$, we may
write $\mathbf{v} = (a, b, c)$ or $\mathbf{v} = a\mathbf{i} + b\mathbf{j} + c\mathbf{k}$ to denote the vector $\mathbf{v}$ that emanates
from $(0, 0, 0)$ and ends at $(a, b, c)$. (Here $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ represent the standard unit
coordinate vectors: $\mathbf{i}$ is the vector from $(0, 0, 0)$ to $(1, 0, 0)$, $\mathbf{j}$ to $(0, 1, 0)$, and $\mathbf{k}$ to
$(0, 0, 1)$.)

In linear algebra, we will prefer to take the perspective of writing such an 
ordered triple as a matrix with only one column, also known as a column vector, 
in the form 


$$
\mathbf{v} = \begin{bmatrix} a \\ b \\ c \end{bmatrix} \tag{1.3.2}
$$


To save space, we will sometimes use the equivalent notation² $\mathbf{v} = [a \ \ b \ \ c]^T$.
Recall that two vectors are equal if and only if their corresponding entries are 
equal, that a vector may be multiplied by a scalar, and that any two vectors of 
the same size may be added. 


2 The ‘T’ stands for transpose, and the transpose of a matrix is achieved by turning every column
into a row.

We can now re-examine the system of equations (1.3.1) in the light of
equality among vectors. In particular, observe that it is equivalent to say 


$$
\begin{bmatrix} x_1 - x_2 \\ x_1 + x_2 \\ x_1 + 2x_2 \end{bmatrix}
= \begin{bmatrix} 1 \\ 3 \\ 4 \end{bmatrix}
\tag{1.3.3}
$$


since two vectors are equal if and only if their corresponding entries are 
equal. Recalling further that vectors are added component-wise, we can 
rewrite (1.3.3) as 


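$$
\begin{bmatrix} x_1 \\ x_1 \\ x_1 \end{bmatrix}
+ \begin{bmatrix} -x_2 \\ x_2 \\ 2x_2 \end{bmatrix}
= \begin{bmatrix} 1 \\ 3 \\ 4 \end{bmatrix}
\tag{1.3.4}
$$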


Finally, we observe in (1.3.4) that the first vector on the left-hand side has 
a common factor of x; in each component, and the second vector similarly 
contains x2. Since a scalar multiple of a vector is computed component-wise, 
here we can rewrite the equation once more, now in the form 


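$$
x_1 \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}
+ x_2 \begin{bmatrix} -1 \\ 1 \\ 2 \end{bmatrix}
= \begin{bmatrix} 1 \\ 3 \\ 4 \end{bmatrix}
\tag{1.3.5}
$$

Equation (1.3.5) is equivalent to the original system (1.3.1), but is now being
viewed in a very different way. Specifically, this last equation asks if there are
values of $x_1$ and $x_2$ for which

$$
x_1\mathbf{v}_1 + x_2\mathbf{v}_2 = \mathbf{b}
$$

where

$$
\mathbf{v}_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \quad
\mathbf{v}_2 = \begin{bmatrix} -1 \\ 1 \\ 2 \end{bmatrix}, \quad \text{and} \quad
\mathbf{b} = \begin{bmatrix} 1 \\ 3 \\ 4 \end{bmatrix}
\tag{1.3.6}
$$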


If we plot the vectors $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{b}$, an interesting situation comes to light, as
seen in figure 1.6. In particular, it appears as if all three vectors lie in the same
plane. Moreover, if we think about the parallelogram law of vector addition and
stretch the vector $\mathbf{v}_1$ by a factor of 2, we see the image in figure 1.7. This shows
geometrically that it appears $\mathbf{b} = 2\mathbf{v}_1 + \mathbf{v}_2$; a quick check of the vector arithmetic
confirms that this is in fact the case. In other words, the unique solution to the
system (1.3.1) is $x_1 = 2$ and $x_2 = 1$.
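The "quick check of the vector arithmetic" mentioned above can be carried out in a couple of lines. In the Python sketch below, the helper name `linear_combination` is hypothetical and vectors are stored as plain lists of entries.

```python
def linear_combination(weights, vectors):
    """Return weights[0]*vectors[0] + weights[1]*vectors[1] + ..., entry by entry."""
    size = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(size)]


v1 = [1, 1, 1]
v2 = [-1, 1, 2]
b = [1, 3, 4]

combo = linear_combination([2, 1], [v1, v2])  # the vector 2*v1 + v2
print(combo == b)
```

Trying other weights, such as 1 and 1, produces a vector different from b, consistent with the solution being unique.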


Among the many important ideas in example 1.3.1, perhaps most significant 
is the way we were able to re-cast a problem about a system of linear equations 
as a question involving vectors. In particular, we saw that it was equivalent to 
ask if there exist constants $x_1$ and $x_2$ such that

$$
x_1\mathbf{v}_1 + x_2\mathbf{v}_2 = \mathbf{b} \tag{1.3.7}
$$

Figure 1.6 The vectors $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{b}$ from (1.3.6).


Figure 1.7 The parallelogram formed by the vectors $2\mathbf{v}_1$ and $\mathbf{v}_2$ from (1.3.6).


Note that in (1.3.7), we are only taking scalar multiples of vectors and adding 
them—computations that are linear in nature. We thus naturally come to use 
the terminology that “$x_1\mathbf{v}_1 + x_2\mathbf{v}_2$ is a linear combination of the vectors $\mathbf{v}_1$ and
$\mathbf{v}_2$.” A more general definition now follows, from which we will be able to widen
our perspective on systems of linear equations. 


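Definition 1.3.1 If $\mathbf{v}_1, \ldots, \mathbf{v}_k$ are vectors in $\mathbb{R}^n$ (that is, each $\mathbf{v}_i$ is a vector with
$n$ entries), and $x_1, \ldots, x_k$ are scalars, then the vector $\mathbf{b}$ given by

$$
\mathbf{b} = x_1\mathbf{v}_1 + \cdots + x_k\mathbf{v}_k \tag{1.3.8}
$$

is a linear combination of the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_k$, with weights or coefficients
$x_1, \ldots, x_k$.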


Note the notational convention we use, as in example 1.3.1: a bold, non- 
italicized, lowercase variable, say x, represents a vector, while a non-bold, 
italicized, lower-case variable, say c, denotes a scalar. A bold, non-italicized, 
uppercase variable, say A, will represent a matrix with at least two columns. 
In light of this new terminology of linear combinations, in example 1.3.1
we saw that the question “is there a solution to the linear system (1.3.1)?” is
equivalent to asking “is the vector $\mathbf{b}$ a linear combination of the vectors $\mathbf{v}_1$
and $\mathbf{v}_2$?”

If we now consider the more general situation of a system of linear 
equations, say 


$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\ \ \vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned}
$$

it follows (as in section 1.2) that we can view this system in terms of the
augmented matrix

$$
[\mathbf{a}_1 \ \ \mathbf{a}_2 \ \ \cdots \ \ \mathbf{a}_n \ \ \mathbf{b}]
$$

where $\mathbf{a}_1$ is the vector in $\mathbb{R}^m$ representing the first column of the augmented
matrix, and so on. Now, however, we have the additional perspective, as in
example 1.3.1, that the first $n$ columns of the augmented matrix are precisely the
vectors being used to form a linear combination in an attempt to construct $\mathbf{b}$.
That is, the general $m \times n$ linear system above asks the question, “is $\mathbf{b}$ a linear
combination of $\mathbf{a}_1, \ldots, \mathbf{a}_n$?”

We make the connection between linear combinations and augmented 
matrices more explicit by defining matrix—vector multiplication in terms of linear 
combinations. 


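Definition 1.3.2 Given an $m \times n$ matrix $\mathbf{A}$ with columns $\mathbf{a}_1, \ldots, \mathbf{a}_n$ that are
vectors in $\mathbb{R}^m$, if $\mathbf{x}$ is a vector in $\mathbb{R}^n$, then we define the product $\mathbf{A}\mathbf{x}$ by the
equation

$$
\mathbf{A}\mathbf{x} = [\mathbf{a}_1 \ \ \mathbf{a}_2 \ \ \cdots \ \ \mathbf{a}_n]
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
= x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + \cdots + x_n\mathbf{a}_n
\tag{1.3.9}
$$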


That is, the matrix—vector product of A and x is the vector Ax obtained 
by taking the linear combination of the column vectors of A according to the 
weights prescribed by the entries in x. Certainly we must have the same number 
of entries in x as columns in A, or Ax will not be defined. The following example 
highlights how to compute and interpret matrix—vector products. 


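Example 1.3.2 Let $\mathbf{a}_1 = [1 \ \ {-4} \ \ 2]^T$ and $\mathbf{a}_2 = [-3 \ \ 1 \ \ 5]^T$, and let $\mathbf{A}$ be the
matrix whose columns are $\mathbf{a}_1$ and $\mathbf{a}_2$. Compute $\mathbf{A}\mathbf{x}$, where $\mathbf{x} = [-5 \ \ 2]^T$, and
interpret the result in terms of linear combinations.

Solution. By definition, we have that

$$
\mathbf{A}\mathbf{x}
= \begin{bmatrix} 1 & -3 \\ -4 & 1 \\ 2 & 5 \end{bmatrix}
\begin{bmatrix} -5 \\ 2 \end{bmatrix}
= -5 \begin{bmatrix} 1 \\ -4 \\ 2 \end{bmatrix}
+ 2 \begin{bmatrix} -3 \\ 1 \\ 5 \end{bmatrix}
= \begin{bmatrix} -11 \\ 22 \\ 0 \end{bmatrix}
$$

The above computations show clearly that the vector $\mathbf{A}\mathbf{x} = [-11 \ \ 22 \ \ 0]^T$ is a
linear combination of $\mathbf{a}_1$ and $\mathbf{a}_2$.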
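Definition 1.3.2 can be turned into code almost word for word. The following Python sketch (the function name `matvec` is hypothetical) stores a matrix as a list of its columns and builds the product by accumulating one weighted column at a time; it is applied to the data of example 1.3.2.

```python
def matvec(columns, x):
    """Compute Ax as the linear combination x[0]*a_1 + ... + x[n-1]*a_n.

    `columns` lists the column vectors a_1, ..., a_n of the matrix A.
    """
    if len(columns) != len(x):
        raise ValueError("x must have one entry per column of A")
    result = [0] * len(columns[0])
    for weight, column in zip(x, columns):
        # Add weight * column to the running total, entry by entry.
        result = [r + weight * c for r, c in zip(result, column)]
    return result


a1 = [1, -4, 2]
a2 = [-3, 1, 5]
product = matvec([a1, a2], [-5, 2])
print(product)
```

The length check mirrors the requirement that x have as many entries as A has columns; otherwise the product is not defined.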


Following a few more computational examples in homework exercises, the 
reader will quickly see how to compute the product Ax whenever it is defined; 
usually we skip past the intermediate stage of writing out the explicit linear 
combination of the columns and simply write the resulting vector. Matrix— 
vector multiplication also has several important general properties, some of 
which will be explored in the exercises. For now, we simply list these properties 
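here for future reference: for any $m \times n$ matrix $\mathbf{A}$, vectors $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$, and $c \in \mathbb{R}$,


* $\mathbf{A}(\mathbf{x} + \mathbf{y}) = \mathbf{A}\mathbf{x} + \mathbf{A}\mathbf{y}$
* $\mathbf{A}(c\mathbf{x}) = c(\mathbf{A}\mathbf{x})$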


The first property shows that matrix multiplication distributes over addition; 
the second demonstrates that a scalar multiple can be taken either before or after 
multiplying the vector x by A. These two properties of matrix multiplication are 
often referred to as being properties of linearity—note the use of only scalar 
multiplication and vector addition in each, and the linear appearance of each 
equation.³ Finally, note that it is also the case that $\mathbf{A}\mathbf{0}_n = \mathbf{0}_m$, where $\mathbf{0}_n$ is the
vector in $\mathbb{R}^n$ with all entries being zero, and $\mathbf{0}_m$ is the corresponding zero vector
in $\mathbb{R}^m$.

There is one more important perspective that this new matrix—vector 
product notation permits. Recall that, in example 1.3.1, we learned that the 
question “is $\mathbf{b}$ a linear combination of $\mathbf{a}_1$ and $\mathbf{a}_2$?” is equivalent to asking “is
there a solution to the system of linear equations whose augmented matrix has
columns $\mathbf{a}_1$, $\mathbf{a}_2$, and $\mathbf{b}$?” Now, in light of matrix-vector multiplication, we also
see that the question “is $\mathbf{b}$ a linear combination of $\mathbf{a}_1$ and $\mathbf{a}_2$?” may be rephrased
as asking “does there exist a vector $\mathbf{x}$ such that $\mathbf{A}\mathbf{x} = \mathbf{b}$?” That is, are there
weights $x_1$ and $x_2$ (the entries in vector $\mathbf{x}$) such that $\mathbf{b}$ is a linear combination of
the columns of A? 

In particular, we may now adopt the perspective that we desire to solve the 
equation Ax = b for the unknown vector x, where A is a matrix whose entries 
are known, and b is a vector whose entries are known. This equation is strikingly 
similar to the most elementary of equations encountered in algebra, ones such as 
2x = 7. Therefore, we see that the linear equation Ax = b, involving matrices and 
vectors, is of fundamental importance as it is another way of expressing questions
regarding linear combinations and solutions of systems of linear equations. In
subsequent sections, we will explore this equation from several perspectives.


3 A deeper discussion of the notion of linear transformations can be found in appendix D.


1.3.1 Markov chains: an application of 
matrix-vector multiplication 


People are often distributed naturally among various groupings. For example, 
much political discussion in the United States is centered on three classifications 
of voters: Democrat, Republican, and Independent. A similar situation can be 
considered with regard to people’s choices for where to live: urban, suburban, or
rural. In each case, the state of the population at a given time is its distribution 
among the relevant categories. 

Furthermore, in each of these situations, it is natural to assume that if we 
consider the state of the system at a given point in time, its state depends on the 
system’s state in the preceding year. For example, the percentage of Democrats, 
Republicans, and Independents in the year 2020 ought to be connected to the 
respective percentages in 2019. 

Let us assume that a population of voters (of constant size) is considered in 
which everyone must be classified as either D, R, or I (Democrat, Republican, or
Independent). Suppose further that a study of voter registrations over many 
years reveals the following trends: from one year to the next, 95 percent 
of Democrats keep their registration the same. For the remaining 5 percent 
who change parties, 2 percent become Republicans and 3 percent become 
Independents. Similar data for Republicans and Independents is given in the 
following table. 


Future party (↓) / current party (→) | D (%) | R (%) | I (%)
Democrat                             |  95   |   3   |   7
Republican                           |   2   |  90   |  13
Independent                          |   3   |   7   |  80


If we let $D_n$, $R_n$, and $I_n$ denote the respective numbers of registered
Democrats, Republicans, and Independents in year $n$, then the table shows
us how to determine the respective numbers in year $n + 1$. For example,

$$
D_{n+1} = 0.95D_n + 0.03R_n + 0.07I_n \tag{1.3.10}
$$

since 95 percent of the Democrats in year $n$ stay registered Democrats, and
3 percent of Republicans and 7 percent of Independents change to Democrats.
Similarly, we have

$$
R_{n+1} = 0.02D_n + 0.90R_n + 0.13I_n \tag{1.3.11}
$$

$$
I_{n+1} = 0.03D_n + 0.07R_n + 0.80I_n \tag{1.3.12}
$$

If we combine (1.3.10), (1.3.11), and (1.3.12) in a single vector equation, then

$$
\begin{bmatrix} D_{n+1} \\ R_{n+1} \\ I_{n+1} \end{bmatrix}
= D_n \begin{bmatrix} 0.95 \\ 0.02 \\ 0.03 \end{bmatrix}
+ R_n \begin{bmatrix} 0.03 \\ 0.90 \\ 0.07 \end{bmatrix}
+ I_n \begin{bmatrix} 0.07 \\ 0.13 \\ 0.80 \end{bmatrix}
\tag{1.3.13}
$$


Here we find that linear combinations of vectors have naturally arisen. Note, 
for example, that the vector $[0.03 \ \ 0.90 \ \ 0.07]^T$ is the Republican vector, and
represents the likelihood that a Republican in a given year will be in one 
of the three parties in the following year. More specifically, we observe that 
probabilities are involved: a Republican has a 3 percent likelihood of registering 
as a Democrat in the following year, a 90 percent likelihood of staying a 
Republican, and 7 percent chance of becoming an Independent. The sum of 
the entries in each column vector is 1. 
If we use the vector x") to represent 


Dn 
x”) =] R, 
In 


and use matrix-vector multiplication to represent the linear combination of 
vectors in (1.3.13), then (1.3.13) is equivalently expressed by the equation 


$$x^{(n+1)} = Mx^{(n)} \tag{1.3.14}$$

where M is the matrix 

$$M = \begin{bmatrix} 0.95 & 0.03 & 0.07 \\ 0.02 & 0.90 & 0.13 \\ 0.03 & 0.07 & 0.80 \end{bmatrix}$$


The matrix M is often called a transition matrix since it shows how the population 
transitions from state n to state n+ 1. We observe that in order for such a 
matrix to represent the probabilities that groups in a particular set of states will 
transition to another set of states, the entries of the matrix M must be 
nonnegative and each column must sum to 1. Such a matrix is called a stochastic matrix or a Markov 
matrix. Finally, we call any system such as the one with three classifications of 
voters, where the state of the system in a given observation period results from 
applying probabilities to a previous state, a Markov chain or Markov process. 
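
The two conditions on a Markov matrix can be checked mechanically. The following Python sketch is an illustration only; the function name `is_stochastic` and the tolerance `tol` are assumptions made here, not part of the text. It tests that every entry is nonnegative and that every column sums to 1, using the voter matrix M from above.

```python
def is_stochastic(matrix, tol=1e-9):
    """Return True if all entries are nonnegative and each column sums to 1.

    `tol` (an assumed tolerance) absorbs floating-point rounding in the sums.
    """
    rows, cols = len(matrix), len(matrix[0])
    nonnegative = all(entry >= 0 for row in matrix for entry in row)
    column_sums = [sum(matrix[i][j] for i in range(rows)) for j in range(cols)]
    return nonnegative and all(abs(s - 1.0) <= tol for s in column_sums)


# Transition matrix for the voter example (columns: D, R, I).
M = [[0.95, 0.03, 0.07],
     [0.02, 0.90, 0.13],
     [0.03, 0.07, 0.80]]

print(is_stochastic(M))
```

Note that the check is on columns, not rows: each column describes where the members of one current group go in the following year.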

We see, for example, that if we had a group of 250000 voters that at year 
$n = 0$ was distributed among Democrats, Republicans, and Independents by 
the vector (with entries measured in thousands) $x^{(0)} = [120\ \ 110\ \ 20]^T$, then we 
can easily compute the projected distribution of voters in subsequent years. In 
particular, (1.3.14) implies 

$$x^{(1)} = Mx^{(0)} = \begin{bmatrix} 118.70 \\ 104 \\ 27.3 \end{bmatrix}, \quad x^{(2)} = Mx^{(1)} = \begin{bmatrix} 117.80 \\ 99.52 \\ 32.68 \end{bmatrix}, \quad x^{(3)} = Mx^{(2)} = \begin{bmatrix} 117.18 \\ 96.18 \\ 36.65 \end{bmatrix}$$

Interestingly, if we continue the sequence, we eventually find that there is very 
little variation from one vector $x^{(n)}$ to the next. For example, 

$$x^{(17)} = \begin{bmatrix} 116.67 \\ 85.95 \\ 47.42 \end{bmatrix}, \quad x^{(18)} = \begin{bmatrix} 116.79 \\ 85.76 \\ 47.44 \end{bmatrix}$$


In fact, as we will learn in our later study of eigenvectors, there exists a vector 
$x^*$ called the steady-state vector for which $x^* = Mx^*$. This shows that the system 
can reach a state in which it does not change from one year to the next. 
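
The repeated multiplication described above is easy to carry out by machine. The sketch below is a minimal Python illustration, not the text's own method (the text uses Maple); the helper names `mat_vec` and `iterate` and the choice of 200 steps are assumptions made here. It reproduces the first three projected distributions and then iterates long enough that the state essentially stops changing, which approximates a steady-state vector numerically.

```python
def mat_vec(matrix, vector):
    """Matrix-vector product: each entry is a row of the matrix dotted with the vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def iterate(matrix, state, steps):
    """Return the list [x(0), x(1), ..., x(steps)] with x(n+1) = M x(n)."""
    history = [state]
    for _ in range(steps):
        state = mat_vec(matrix, state)
        history.append(state)
    return history


M = [[0.95, 0.03, 0.07],
     [0.02, 0.90, 0.13],
     [0.03, 0.07, 0.80]]
x0 = [120.0, 110.0, 20.0]  # thousands of D, R, I voters in year 0

history = iterate(M, x0, 200)  # 200 steps is an arbitrary, generous choice
for n in (1, 2, 3):
    print(n, [round(v, 2) for v in history[n]])

steady = history[-1]             # numerical approximation of the steady state
change = [abs(a - b) for a, b in zip(mat_vec(M, steady), steady)]
print(max(change))               # very small: M * steady is essentially steady
```

Because each column of M sums to 1, the total population is preserved at every step, so the entries of each state continue to add up to 250 (thousand).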
Another example is instructive. 


Example 1.3.3. Geographers studying a metropolitan area have observed a 
trend that while the population of the area stays roughly constant, people within 
the city and its suburbs are migrating back and forth. In particular, suppose that 
85 percent of people whose homes are in the city keep their residence from one 
year to the next; the remainder move to the suburbs. Likewise, while 92 percent 
of people whose homes are in suburbs will live there the next year, the other 
8 percent will move into the city. 

Assuming that in a given year there are 230000 people living in the city and 
270000 people in the surrounding suburbs, predict the population distribution 
over the next 3 years. 


Solution. If we let $C_n$ and $S_n$ denote the populations of the city and suburbs 
in year $n$, the given information tells us that the following relationships hold: 

$$C_{n+1} = 0.85C_n + 0.08S_n$$
$$S_{n+1} = 0.15C_n + 0.92S_n$$

Using the notation 

$$x^{(n)} = \begin{bmatrix} C_n \\ S_n \end{bmatrix}$$

we can model the changing distribution of the population between the city and 
suburbs with the Markov process $x^{(n+1)} = Mx^{(n)}$, where M is the Markov matrix 

$$M = \begin{bmatrix} 0.85 & 0.08 \\ 0.15 & 0.92 \end{bmatrix}$$

In particular, starting with $x^{(0)} = [230\ \ 270]^T$ (in thousands), we see that 

$$x^{(1)} = \begin{bmatrix} 217.10 \\ 282.90 \end{bmatrix}, \quad x^{(2)} = \begin{bmatrix} 207.17 \\ 292.83 \end{bmatrix}, \quad x^{(3)} = \begin{bmatrix} 199.52 \\ 300.48 \end{bmatrix}$$


As with voter distribution, this example is oversimplified. For instance, 
we have not taken into account members of the population who move into 
or away from the metropolitan area. Nonetheless, the basic ideas of Markov 
processes are important in the study of systems whose current state depends on 
preceding ones, and we see the key role matrices and matrix multiplication play 
in representing them. 

1.3.2 Matrix products using Maple 


After becoming comfortable with computing elementary matrix products by 
hand, it is useful to see how Maple can assist us with more complicated 
computations. Here, we demonstrate the relevant command. 

Revisiting example 1.3.2, to compute the product Ax, we first enter A and 
x using the familiar commands 


> A := <<1, -4, 2>|<-3, 1, 5>>; x := <<-5,2>>; 


Next, we use the ‘period’ symbol to inform Maple that we want to multiply. 
Entering 


> A.x; 


computes the product Ax. 

Note: Maple will perform the multiplication only when it is defined. 
If, say, we were to attempt to multiply a 2 x 2 matrix and a 3 x 1 vector, Maple 
would report the following: 


Error, (in LinearAlgebra:-MatrixVectorMultiply) 
vector dimension (3) must be the same as the 
matrix column dimension (2). 


Exercises 1.3 For exercises 1-4, where a matrix A and vector x are given, 
compute the product Ax in every case that it is defined. If the product is 
undefined, explain why. 


1-3 2 -1 
LA=|_j ob x=| | 


-1 
1-3 2 
2. A=| I x= 2 
-4 10 ; 
5 —2 
3.A=| 1 -1], a= |2| 
3 2 
3 
4.A=[-4 27], x=| 5 

5. Recall from multivariable calculus that given vectors $x, y \in \mathbb{R}^3$, the dot 
product of $x$ and $y$, $x \cdot y$, is computed by taking 

$$x \cdot y = x_1y_1 + x_2y_2 + x_3y_3$$

How can matrix-vector multiplication (when defined) be viewed as the 
result of computing several appropriate dot products? Explain. 


6. For the system of equations given below, determine a vector equation with 
an equivalent solution. What is the system asking in regard to linear 
combinations of certain vectors? 

xX) + 2% =1 
xi) + 1% =0 


In addition, determine a matrix A and vector b so that the equation 
Ax = b is equivalent to the given system of equations. 


7. For the system of differential equations (1.1.6) (also given below) from the 
introductory section, how can we rewrite the system in matrix-vector 
notation? 
dx xX} x 
=20- 
dt 20 * 80 
dx xX, %X 
=35 — 
dt - 40 40 


Hint: recall that if $x(t)$ is a vector function, we write $x'(t)$ or $dx/dt$ for the 
vector $[dx_1/dt\ \ dx_2/dt]^T$. 


8. Determine if the vector $b = [-3\ \ 1\ \ 5]^T$ is a linear combination of the 
vectors $a_1 = [-1\ \ 2\ \ 1]^T$, $a_2 = [3\ \ 1\ \ 1]^T$, and $a_3 = [1\ \ 5\ \ 3]^T$. If so, will more 
than one set of weights work? 


9. Determine if the vector $b = [0\ \ 7\ \ 4]^T$ is a linear combination of the vectors 
$a_1 = [-1\ \ 2\ \ 1]^T$, $a_2 = [3\ \ 1\ \ 1]^T$, and $a_3 = [1\ \ 5\ \ 3]^T$. If so, will more than 
one set of weights work? 


10. We know from our work in this section that the matrix equation Ax = b 
corresponds both to a vector equation and a system of linear equations. 
What is the augmented matrix that represents this system of equations? 


In exercises 11-15, let A be the stated matrix and b the given vector. Solve 
the linear equation Ax = b by converting the equation to a system of linear 
equations and row-reducing appropriately. If the system has more than one 
solution, express the solution in parametric vector form. Finally, write a sentence 
in each case that explains how the vector b is related to linear combinations of 
the columns of A. 


45 -l 13 
u.a=[5 i Al b= [5 
1 -3 5 
144.A=]-2 1], b=] —-5 

3 -1 5 

> =3 1 0 
15.A=|-2 1 4], b= 22 

1 0 -2 -11 

16. Linear equations of the form Ax = 0 are important for a variety of reasons, 
some of which we will study in the next section. Explain why the system of 
linear equations corresponding to the equation Ax = 0 is always 
consistent, regardless of the matrix A. 


In exercises 17-21, solve the linear equation Ax = 0 by row-reducing 
appropriately. If the system has more than one solution, express the solution in 
parametric vector form. 


> =3 1 
21.A=]-2 1 
1 0 -2 


22. Let A= ; | and b= i: i Describe the set of all vectors b for 
= 2 


which the equation Ax = b is consistent. 


23. Let v) = - | i rat and b= i | Describe the set of all 


vectors b for which b is a linear combination of $v_1$ and $v_2$. 

24. Let A be an m x n matrix, $x$ and $y \in \mathbb{R}^n$, and $c \in \mathbb{R}$. Show that 


(a) A(x + y) = Ax + Ay 
(b) A(cx) = c(Ax) 


25. Decide whether each of the following sentences is true or false. In every 
case, write one sentence to support your answer. 


(a) To compute the product Ax, the vector x must have the same number 
of entries as the number of rows in A. 

(b) A linear combination of three vectors in R? produces another vector in 
R>. 

(c) If b is a linear combination of $v_1$ and $v_2$, then there exist scalars $c_1$ and 
$c_2$ such that $c_1v_1 + c_2v_2 = b$. 

(d) If A is a matrix and x and b are vectors such that Ax = b, then x is a 
linear combination of the columns of A. 

(e) The equation Ax = 0 can be inconsistent. 


26. Suppose that for a large population that stays relatively constant, people 
are classified as living in urban, suburban, or rural settings. Moreover, 
assume that the probabilities of the various possible transitions are given 
by the following table: 


Future location (↓) / current location (→) | U (%) | S (%) | R (%) 
Urban | 92 | 3 | 2 
Suburban | 7 | 96 | 10 
Rural | 1 | 1 | 88 


Given that the population of 250 million is initially distributed in 100 
million urban, 100 million suburban, and fifty million rural, predict the 
population distribution in each of the following five years. 


27. Car-owners can be grouped into classes based on the vehicles they own. A 
study of owners of sedans, minivans, and sport utility vehicles shows that 
the likelihood that an owner of one of these automobiles will replace it 
with another of the same or different type is given by the table 


Future vehicle (↓) / current vehicle (→) | Sedan (%) | Minivan (%) | SUV (%) 
Sedan | 91 | 3 | 2 
Minivan | 7 | 95 | 8 
SUV | 2 | 2 | 90 


If there are currently 100 000 sedans, 60 000 minivans, and 80 000 SUVs 
among the owners being studied, predict the distribution of vehicles 
among the population after each owner has replaced her vehicle 3 times. 


1.4 The span of a set of vectors 


In section 1.3, we saw that the question “is b a linear combination of $a_1$ and 
$a_2$?” provides an important new perspective on solutions of linear systems of 
equations. It is natural to slightly rephrase this question and ask more generally 
“which vectors b may be written as linear combinations of $a_1$ and $a_2$?” We 
explore this question further through the following sequence of examples. 


Example 1.4.1 Describe the set of all vectors in $\mathbb{R}^2$ that may be written as a 
linear combination of the vector $a_1 = [2\ \ 1]^T$. 


Solution. Since we have just one vector $a_1$, any linear combination of $a_1$ has 
the form $ca_1$, which of course is a scalar multiple of $a_1$. Geometrically, the 
vectors that are linear combinations of $a_1$ are stretches of $a_1$, which lie on the 
line through (0, 0) in the direction of $a_1$, as shown in figure 1.8. 


In this first example, we see a visual way to interpret the question about 
linear combinations: essentially we want to know “which vectors can we create 
using only linear combinations of $a_1$?” The answer is not surprising: only vectors 
that lie on the line through the origin in the direction of $a_1$. 

Next, we consider how the situation changes when we consider two parallel 
vectors. 


Figure 1.8 The set of all linear combinations of $a_1$ in example 1.4.1. 


Example 1.4.2 Describe the set of all vectors in $\mathbb{R}^2$ that may be written as a 
linear combination of the vectors $a_1 = [2\ \ 1]^T$ and $a_2 = [-1\ \ {-\tfrac{1}{2}}]^T$. 


Solution. Observe first that $-\tfrac{1}{2}a_1 = a_2$. Here we are considering the set of all 
vectors $y$ of the form 

$$y = c_1 \begin{bmatrix} 2 \\ 1 \end{bmatrix} + c_2 \begin{bmatrix} -1 \\ -\tfrac{1}{2} \end{bmatrix}$$

In figure 1.9, we observe that the vectors $a_1$ and $a_2$ point in opposing directions. 
When we take a linear combination of these vectors to form $y$, we are adding 
a stretch of $c_1$ units of the first to a stretch of $c_2$ units of the second. Because 
the two directions are parallel, this leaves the resulting vector as a stretch of 
one of the two original vectors, and therefore on the line through the origin 
in their direction. This may also be seen algebraically since $-\tfrac{1}{2}a_1 = a_2$ implies 

$$y = c_1a_1 + c_2a_2 = c_1a_1 - \tfrac{1}{2}c_2a_1 = \left(c_1 - \tfrac{1}{2}c_2\right)a_1.$$

We note particularly that since the two given vectors $a_1$ and $a_2$ are parallel, 
any linear combination of them is actually a scalar multiple of $a_1$. Thus, the 
resulting set of all linear combinations is identical to what we found with the 
single vector given in example 1.4.1. 

Finally, we consider all linear combinations of two non-parallel vectors. 


Figure 1.9 The set of all linear combinations of $a_1$ and $a_2$ in example 1.4.2. 


Example 1.4.3. Describe the set of all vectors in $\mathbb{R}^2$ that may be written as a 
linear combination of the vectors $a_1 = [2\ \ 1]^T$ and $a_2 = [1\ \ 2]^T$. 


Figure 1.10 Linear combinations of $a_1$ and $a_2$ from example 1.4.3 (one labeled vector is $\tfrac{7}{3}a_1 - \tfrac{5}{3}a_2$). 


Solution. Algebraically, we are again considering the set of all vectors $y$ such 
that $y = c_1a_1 + c_2a_2$. A visual way to think about how the set of all such vectors $y$ 
looks is found in the question, “which vectors can we create by taking a stretch 
of $a_1$ and adding this to a stretch of $a_2$?” 

If we consider a plot of the given two vectors $a_1$ and $a_2$ and think of the 
“grid” that is formed by considering all of their stretches and the sums of their 
stretches, we have the picture shown in figure 1.10. The fact that $a_1$ and $a_2$ are 
not parallel enables us to “get off the line” that each one generates through the 
origin. For example, if we simply take the sum of these two vectors and set 
$y = a_1 + a_2$, by the parallelogram law of vector addition we arrive at the new 
vector $[3\ \ 3]^T$ shown in figure 1.10. Two other linear combinations are shown 
as well, and from here it is not hard to visualize the fact that we can create 
any vector in the plane using linear combinations of the non-parallel vectors $a_1$ 
and $a_2$. In other words, the set of all linear combinations of $a_1$ and $a_2$ is $\mathbb{R}^2$. 


It is also possible to verify our findings in example 1.4.3 algebraically. We 
will explore this further in the exercises and in section 1.5. 
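
One way to see the algebra behind example 1.4.3 is to solve for the weights directly. The Python sketch below is an illustration under stated assumptions: the function name `weights` is hypothetical, and Cramer's rule for a 2 x 2 system is used here as a convenient shortcut (the text itself works by row reduction). For non-parallel vectors the determinant is nonzero, so weights exist for every target vector; for parallel vectors the determinant is zero and the sketch refuses to answer.

```python
from fractions import Fraction


def weights(a1, a2, y):
    """Solve c1*a1 + c2*a2 = y for 2-vectors using Cramer's rule.

    Raises ValueError when a1 and a2 are parallel (zero determinant),
    since then a unique pair of weights does not exist.
    """
    det = Fraction(a1[0]) * a2[1] - Fraction(a2[0]) * a1[1]
    if det == 0:
        raise ValueError("a1 and a2 are parallel")
    c1 = (Fraction(y[0]) * a2[1] - Fraction(a2[0]) * y[1]) / det
    c2 = (Fraction(a1[0]) * y[1] - Fraction(y[0]) * a1[1]) / det
    return c1, c2


a1, a2 = (2, 1), (1, 2)
print(weights(a1, a2, (3, 3)))    # the sum a1 + a2 from the example
print(weights(a1, a2, (3, -1)))   # another point in the plane
```

Exact fractions are used so that weights such as 7/3 and -5/3 are reported without rounding.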

Certainly we are not limited to considering linear combinations of only two 
vectors. We therefore introduce a more formal perspective and terminology to 
describe the phenomena examined in the above examples. 


Definition 1.4.1 Given a set of vectors $S = \{v_1, \ldots, v_k\}$, $v_i \in \mathbb{R}^m$, the span of S, 
denoted Span(S) or Span$\{v_1, \ldots, v_k\}$, is the set of all linear combinations of the 
vectors $v_1, \ldots, v_k$. Equivalently, Span(S) is the set of all vectors $y$ of the form 

$$y = c_1v_1 + \cdots + c_kv_k$$

where $c_1, \ldots, c_k$ are scalars. We also say that Span(S) is the subset of $\mathbb{R}^m$ spanned 
by the vectors $v_1, \ldots, v_k$. 


For any single nonzero vector $v_1 \in \mathbb{R}^m$, Span$\{v_1\}$ consists of all vectors that 
lie on the line through the origin in $\mathbb{R}^m$ in the direction of $v_1$. For two 
non-parallel vectors $v_1, v_2 \in \mathbb{R}^m$, Span$\{v_1, v_2\}$ is the plane through the origin that 
contains both the vectors $v_1$ and $v_2$. 

Next, let us recall that our interest in linear combinations was motivated by 
a desire to look at systems of linear equations from a new perspective. How is 
the concept of span related to linear systems? We begin to answer this question 
by considering the special situation where b = 0. 

A system of linear equations that can be represented in matrix form by 
the equation Ax = 0 is said to be homogeneous; the case Ax = b with $b \neq 0$ is termed 
nonhomogeneous. We also call the equation Ax = 0 a homogeneous equation. 
By the definition of matrix-vector multiplication, it is immediately clear that 
A0 = 0 (note that these two zero vectors may be of different sizes), and thus 
any homogeneous equation has at least one solution and is guaranteed to be 
consistent. We will usually call the solution x = 0 the trivial solution. Under 
what circumstances will a homogeneous system have nontrivial solutions? How 
is this question related to the span of a set of vectors? The following example 
provides insight into these questions. 


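Example 1.4.4 Solve the homogeneous system of linear equations given by the 
equation Ax = 0 where A is the matrix 

$$A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 1 & -1 & 3 \\ 1 & 0 & -2 & 2 \\ 8 & 5 & -1 & 11 \end{bmatrix}$$

If more than one solution exists, express the solution in parametric vector form. 

Solution. To begin, we augment the matrix A with a column of zeros to 
represent the vector 0 in the system given by Ax = 0. We then row-reduce this 
augmented matrix to find 

$$\left[\begin{array}{cccc|c} 1 & 1 & 1 & 1 & 0 \\ 2 & 1 & -1 & 3 & 0 \\ 1 & 0 & -2 & 2 & 0 \\ 8 & 5 & -1 & 11 & 0 \end{array}\right] \rightarrow \left[\begin{array}{cccc|c} 1 & 0 & -2 & 2 & 0 \\ 0 & 1 & 3 & -1 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right]$$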


We observe that the system has two free variables, and therefore infinitely many 
solutions. In particular, these solutions must satisfy the equations 

$$x_1 - 2x_3 + 2x_4 = 0$$
$$x_2 + 3x_3 - x_4 = 0$$

where $x_3$ and $x_4$ are free. Equivalently, using these equations and vector addition 
and scalar multiplication, it must be the case that any solution x to Ax = 0 has 
the form 

$$x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} 2x_3 - 2x_4 \\ -3x_3 + x_4 \\ x_3 \\ x_4 \end{bmatrix} = x_3 \begin{bmatrix} 2 \\ -3 \\ 1 \\ 0 \end{bmatrix} + x_4 \begin{bmatrix} -2 \\ 1 \\ 0 \\ 1 \end{bmatrix} \tag{1.4.1}$$

where $x_3, x_4 \in \mathbb{R}$. Note particularly that this shows that every solution x to the 
original homogeneous equation Ax = 0 can be expressed as a linear combination 
of the two vectors on the rightmost side of (1.4.1). Moreover, it is also the case 
that every linear combination of these two vectors is a solution to the equation. 
In light of the terminology of span, we can say that the set of all solutions to the 
homogeneous equation Ax = 0 is Span$\{v_1, v_2\}$, where 

$$v_1 = \begin{bmatrix} 2 \\ -3 \\ 1 \\ 0 \end{bmatrix}, \quad v_2 = \begin{bmatrix} -2 \\ 1 \\ 0 \\ 1 \end{bmatrix}$$
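
The claim that every linear combination of the two spanning vectors solves Ax = 0 can be spot-checked numerically. The following Python sketch is only an illustration; the helper names `mat_vec` and `combination` are assumptions made here. It multiplies A from example 1.4.4 by each spanning vector and by a few sample combinations and confirms that the result is the zero vector each time.

```python
def mat_vec(matrix, vector):
    """Matrix-vector product computed row by row."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


# Coefficient matrix from example 1.4.4.
A = [[1, 1, 1, 1],
     [2, 1, -1, 3],
     [1, 0, -2, 2],
     [8, 5, -1, 11]]

# The two vectors that span the solution set of Ax = 0.
v1 = [2, -3, 1, 0]
v2 = [-2, 1, 0, 1]


def combination(s, t):
    """Return s*v1 + t*v2; s and t play the roles of the free variables x3 and x4."""
    return [s * a + t * b for a, b in zip(v1, v2)]


for s, t in [(1, 0), (0, 1), (5, -7), (-2, 3)]:
    print((s, t), mat_vec(A, combination(s, t)))
```

A finite set of samples is not a proof, of course; the general statement follows from the properties A(x + y) = Ax + Ay and A(cx) = c(Ax).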


In this section, we have seen that the set of all linear combinations of a 
set of vectors can be interpreted geometrically, particularly in the case when 
we only have one or two vectors present, by thinking about lines and planes. In 
addition, the span of a set of vectors arises naturally in considering homogeneous 
equations in which infinitely many solutions are present. In that situation, the 
set of all solutions can be expressed as the span of a set of k vectors, where 
k is the number of free variables that arise in row-reducing the augmented 
matrix. 


Exercises 1.4 In exercises 1-6, solve the homogeneous equation Ax = 0, 
given the matrix A. If infinitely many solutions exist, express the solution set as 
the span of the smallest possible set of vectors. 


1 -3 2 
anf 2 J 
3 1 —- 
A= 1 3 
-1 1 
1 -l 2 
A= 4 -2 6 
—7 3 —10 


7. Let A be an m x n matrix where n > m. Is it possible that Ax = 0 has only 
the trivial solution? Explain why or why not. 


8. Let A be an m x n matrix where n < m. Is it guaranteed that Ax = 0 will 
have only the trivial solution? Explain why or why not. 


9. Determine if the vector $b = [11\ \ -4]^T$ is in the span of the vectors 
$a_1 = [3\ \ -2]^T$ and $a_2 = [-9\ \ 6]^T$. Justify your answer carefully. 


10. Determine if the vector $b = [-17\ \ 31]^T$ is in the span of the vectors 
$a_1 = [1\ \ 0]^T$ and $a_2 = [0\ \ 1]^T$. What do you observe? 


11. Determine if the vector $b = [9\ \ 17\ \ 11]^T$ is in the span of the vectors 
$a_1 = [-1\ \ 2\ \ 1]^T$, $a_2 = [3\ \ 1\ \ 1]^T$, and $a_3 = [1\ \ 5\ \ 3]^T$. Justify your answer. 


12. Explain why the vector $b = [3\ \ 2]^T$ does not lie in the span of the set S, 
where $S = \{v\}$ and $v = [1\ \ 1]^T$. 


13. Describe geometrically the set W = Span$\{v_1, v_2\}$, where $v_1 = [1\ \ 1\ \ 1]^T$ and 
$v_2 = [-3\ \ 0\ \ 2]^T$. 


14. Can every vector $b \in \mathbb{R}^3$ be found in W = Span$\{v_1, v_2\}$, where 
$v_1 = [1\ \ 1\ \ 1]^T$ and $v_2 = [-3\ \ 0\ \ 2]^T$? If so, explain why. If not, find a vector 
not in W and justify your answer. 


15. Show that every point (vector) that lies on the line with equation 
$2x_1 - 3x_2 = 0$ also lies in the set W = Span$\{v_1\}$, where $v_1 = [3\ \ 2]^T$. 


16. Show that every point (vector) that lies on the plane with equation 
$-x + y + z = 0$ also lies in the set W = Span$\{v_1, v_2\}$, where 
$v_1 = [1\ \ -1\ \ 2]^T$ and $v_2 = [2\ \ 1\ \ 1]^T$. 


17. Decide whether each of the following sentences is true or false. In every 
case, write one sentence to support your answer. 


(a) The span of a single nonzero vector in R* can be thought of as a line 
through the origin. 

(b) The span of any two nonzero vectors in $\mathbb{R}^3$ can be viewed as a plane 
through the origin in $\mathbb{R}^3$. 

(c) If Ax = b holds true for a given matrix A and vectors x and b, then x lies 
in the span of the columns of A. 

(d) It is possible for a homogeneous equation Ax = 0 to be inconsistent. 

(e) The number of free variables present in the solution to Ax = 0 is the 
same as the number of pivot columns in the matrix A. 


1.5 Systems of linear equations revisited 


From our initial work with row-reducing a system of linear equations to our 
recent discussions of linear combinations and span, we have seen already that 
there are several perspectives from which to view a system of linear equations. 
One is purely algebraic: “is there at least one ordered list $(x_1, \ldots, x_n)$ that makes 
every equation in a given system true?” Here we are viewing the system in 
the form 

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\ &\ \,\vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m \end{aligned}$$

In light of linear combinations, we can rephrase this question geometrically as 
“is the vector b a linear combination of the vectors $a_1, \ldots, a_n$?”, where $a_i$ is the 
$i$th column of the coefficient matrix of the system. From this standpoint, asking 
if the system has a solution can be thought of in terms of the question, “does 
the vector b belong to the span of the columns of A?” Finally, through matrix 
multiplication, we can also express this system of equations in its simplest form: 
Ax = b. From all of this, we know that the question, “Does Ax = b have at least 
one solution?” is one of fundamental importance. 

We have also seen that in the special case of the homogeneous equation 
Ax = 0, the answer to the above questions is always affirmative, since setting 
x = 0 guarantees that we have at least one solution. In what follows, we 
further explore the nonhomogeneous case Ax = b, with particular emphasis 
on understanding characteristics of the matrix A that enable us to answer the 
questions in the preceding paragraph. 

We begin by revisiting example 1.4.2 from a more algebraic perspective. 


Example 1.5.1 For which vectors b is the equation Ax = b consistent, if A is 
the matrix whose columns are the vectors $a_1 = [2\ \ 1]^T$ and $a_2 = [-1\ \ {-\tfrac{1}{2}}]^T$? 

Solution. By the definition of matrix multiplication, this question is equivalent 
to asking, “which vectors b are linear combinations of the columns of A?” This 
question may be equivalently rephrased as “which vectors b are in the span of 
the columns of A?” We have already answered this question from a geometric 
perspective in example 1.4.2, where we saw that since $a_1$ and $a_2$ are parallel, 
it follows that every vector in $\mathbb{R}^2$ that lies on the line through the origin in 
the direction of $a_1$ can be written as a linear combination of the two vectors. 
Nonetheless, it is insightful to explore algebraically why this is the case. 


Letting b be the vector whose entries are $b_1$ and $b_2$ and writing the equation 
Ax = b in the form of an augmented matrix, we row-reduce and find that 

$$\left[\begin{array}{cc|c} 2 & -1 & b_1 \\ 1 & -\tfrac{1}{2} & b_2 \end{array}\right] \rightarrow \left[\begin{array}{cc|c} 1 & -\tfrac{1}{2} & b_2 \\ 0 & 0 & b_1 - 2b_2 \end{array}\right]$$

The second row in the augmented matrix represents the equation 

$$0x_1 + 0x_2 = b_1 - 2b_2$$

Observe that if $b_1 - 2b_2 \neq 0$, this equation cannot possibly be true, and therefore 
the system would be inconsistent. Said differently, the only way for Ax = b to 
be consistent is for $b_1 - 2b_2 = 0$. That is, if b is a vector such that $b_1 = 2b_2$, or 

$$b = \begin{bmatrix} 2b_2 \\ b_2 \end{bmatrix}$$

then Ax = b is consistent. This makes sense geometrically, since the span of the 
columns of A is all the stretches of the vector $a_1 = [2\ \ 1]^T$. 


An important lesson to take from example 1.5.1 is that the equation Ax = b 
discussed there is not consistent for every choice of b. In fact, the equation is only 
consistent for very limited choices of b. For example, if $b = [6\ \ 3]^T$, the equation 
is consistent, but if $b = [6\ \ k]^T$ for any $k \neq 3$, the equation is inconsistent. 
Moreover, we should observe that for the matrix in this example, A does not 
have a pivot position in every row. This is what ultimately leads to the algebraic 
equation $0x_1 + 0x_2 = b_1 - 2b_2$, and the potential inconsistency of Ax = b. 

At this point in our work, it is important that we begin to generalize our 
observations in order to apply them in new, but similar, circumstances. We 
again emphasize that it is a noteworthy characteristic of linear algebra that the 
discipline often offers great flexibility through the large number of ways to say 
the same thing; at times, one way of stating a fact can give more insight than 
others, and therefore it is important to be well versed in shifting among multiple 
perspectives. The following theorem is of the form “the following statements are 
equivalent”; this means that if any one of the statements is true, all the others are 
as well. Likewise, if any one statement is false, every statement in the theorem 
must be false. 

This theorem formalizes our findings in the example above, and, in some 
sense, our work in the first several sections of the text. 


Theorem 1.5.1 Let A be an m x n matrix and b a vector in $\mathbb{R}^m$ so that the 
equation Ax = b represents a system of m linear equations in n unknown 
variables. The following statements are equivalent: 


a. The equation Ax = b is consistent 


b. The vector b is a linear combination of the columns of A 


c. The vector b is in the span of the columns of A 


d. When the augmented matrix [A b] is row-reduced, there are no rows 
where the first n entries are zero and the last entry is nonzero. 


The following example demonstrates how we can use theorem 1.5.1 to 
answer questions about span and linear combinations. 


Example 1.5.2 Does the vector $b = [1\ \ -7\ \ -14]^T$ belong to the span of the 
vectors $a_1 = [1\ \ 3\ \ 4]^T$, $a_2 = [2\ \ 1\ \ -1]^T$, and $a_3 = [0\ \ 5\ \ 9]^T$? Does the result 
change if we ask the same question about the vector $c = [1\ \ -7\ \ -13]^T$? 


Solution. By theorem 1.5.1, we know that it is equivalent to ask if the equation 
Ax = b is consistent, where b is the given vector and A is the matrix whose 
columns are $a_1$, $a_2$, and $a_3$. To answer that question, we consider the augmented 
matrix [A | b] and row-reduce: 

$$\left[\begin{array}{ccc|c} 1 & 2 & 0 & 1 \\ 3 & 1 & 5 & -7 \\ 4 & -1 & 9 & -14 \end{array}\right] \rightarrow \left[\begin{array}{ccc|c} 1 & 0 & 2 & -3 \\ 0 & 1 & -1 & 2 \\ 0 & 0 & 0 & 0 \end{array}\right]$$

Because this system of equations is consistent, it follows that b is indeed a linear 
combination of the columns of A and therefore b lies in the span of $a_1$, $a_2$, 
and $a_3$. 

If we instead consider the vector c stated in the example and proceed 
similarly, row-reduction shows that 

$$\left[\begin{array}{ccc|c} 1 & 2 & 0 & 1 \\ 3 & 1 & 5 & -7 \\ 4 & -1 & 9 & -13 \end{array}\right] \rightarrow \left[\begin{array}{ccc|c} 1 & 0 & 2 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{array}\right]$$

which implies that the system is inconsistent and therefore c is not a linear 
combination of the columns of A, or equivalently, c does not lie in the span of 
$a_1$, $a_2$, and $a_3$. 
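
The consistency test in part (d) of theorem 1.5.1 can be automated. The sketch below is a minimal Python illustration, not the text's method of working by hand or in Maple; the function names `rref` and `is_consistent` are assumptions made here. It row-reduces the augmented matrix with exact fractions and declares the system consistent precisely when the last column is not a pivot column.

```python
from fractions import Fraction


def rref(rows):
    """Return (reduced row echelon form, list of pivot column indices)."""
    m = [[Fraction(v) for v in row] for row in rows]
    pivot_cols = []
    r = 0  # index of the row where the next pivot will be placed
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue                                  # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]               # swap pivot row into place
        m[r] = [v / m[r][c] for v in m[r]]            # scale pivot entry to 1
        for i in range(len(m)):
            if i != r and m[i][c] != 0:               # clear the rest of the column
                factor = m[i][c]
                m[i] = [vi - factor * vr for vi, vr in zip(m[i], m[r])]
        pivot_cols.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivot_cols


def is_consistent(A, b):
    """Ax = b is consistent iff the last column of [A | b] has no pivot."""
    augmented = [list(row) + [bi] for row, bi in zip(A, b)]
    _, pivot_cols = rref(augmented)
    return len(A[0]) not in pivot_cols


# Matrix whose columns are a1, a2, a3 from example 1.5.2.
A = [[1, 2, 0],
     [3, 1, 5],
     [4, -1, 9]]
print(is_consistent(A, [1, -7, -14]))   # the vector b
print(is_consistent(A, [1, -7, -13]))   # the vector c
```

A pivot in the last column corresponds exactly to a row whose first entries are zero and whose last entry is nonzero, which is the inconsistent situation described in the theorem.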


At this point, it is natural to think the situations in examples 1.5.1 and 1.5.2 
are somewhat dissatisfying: sometimes Ax = b is consistent, and sometimes not, 
all depending on our choice of b. A natural question to ask is, “are there matrices 
A for which Ax = b is consistent for every choice of b?” With that question, we 
are certainly interested in the properties of the matrix A that make this situation 
occur. We next revisit example 1.4.3 and explore these issues further. 


Example 1.5.3. For which vectors b is the equation Ax = b consistent, if A is 
the matrix whose columns are the vectors $a_1 = [2\ \ 1]^T$ and $a_2 = [1\ \ 2]^T$? 

Solution. Proceeding as in the previous example, we row reduce the 
augmented matrix form of the equation and find that 

$$\left[\begin{array}{cc|c} 2 & 1 & b_1 \\ 1 & 2 & b_2 \end{array}\right] \rightarrow \left[\begin{array}{cc|c} 1 & 0 & \tfrac{2}{3}b_1 - \tfrac{1}{3}b_2 \\ 0 & 1 & -\tfrac{1}{3}b_1 + \tfrac{2}{3}b_2 \end{array}\right]$$

Algebraically, this shows that regardless of the entries we select for the vector 
b, we can always find a solution to the equation Ax = b. In particular, x is 
the vector in $\mathbb{R}^2$ whose components are $x_1 = \tfrac{2}{3}b_1 - \tfrac{1}{3}b_2$ and $x_2 = -\tfrac{1}{3}b_1 + \tfrac{2}{3}b_2$. 
Thus the equation Ax = b is consistent for every b in $\mathbb{R}^2$. Note that this is 
not surprising, given our work in example 1.4.3, where we found that from 
a geometric perspective, every vector $b \in \mathbb{R}^2$ could be written as a linear 
combination of $a_1$ and $a_2$. This example simply confirms that finding, but now 
from an algebraic point of view. 


In terms of a key property of the matrix in example 1.5.3, we see that A has 
a pivot position in every row. In particular, there is no row in RREF(A) where 
we encounter all zeros, and thus it is impossible to ever encounter an equation 
of the form 0 = k, where $k \neq 0$. This is, therefore, one property of the matrix A 
that guarantees consistency for every choice of b. 

We generalize our findings in this example in the following theorem, which 
is similar to theorem 1.5.1, but now focuses solely on the matrix A and no longer 
requires a vector b to be initially chosen. 


Theorem 1.5.2 Let A be an m x n matrix. The following statements are 
equivalent: 

a. The equation Ax = b is consistent for every $b \in \mathbb{R}^m$ 

b. Every vector $b \in \mathbb{R}^m$ is a linear combination of the columns of A 

c. The span of the columns of A is $\mathbb{R}^m$ 

d. A has a pivot position in every row. That is, when the matrix A is 
row-reduced, there are no rows of all zeros. 


Our next example shows how we can apply theorem 1.5.2 to answer general 
questions about the span of a set of vectors and the consistency of related systems 
of equations. 


Example 1.5.4 Does the vector $b = [1\ \ -7\ \ -13]^T$ belong to the span of the 
vectors $a_1 = [1\ \ 3\ \ 4]^T$, $a_2 = [2\ \ 1\ \ -1]^T$, and $a_3 = [0\ \ 5\ \ 10]^T$? Can every vector 
in $\mathbb{R}^3$ be found in the span of the vectors $a_1$, $a_2$, and $a_3$? 


Solution. Just as in example 1.5.2, we know by theorem 1.5.1 that it is 
equivalent to ask if the equation Ax = b is consistent, where b is the given 
vector and A is the matrix whose columns are $a_1$, $a_2$, and $a_3$. We thus consider 
the augmented matrix [A | b] and row-reduce: 

$$\left[\begin{array}{ccc|c} 1 & 2 & 0 & 1 \\ 3 & 1 & 5 & -7 \\ 4 & -1 & 10 & -13 \end{array}\right] \rightarrow \left[\begin{array}{ccc|c} 1 & 0 & 0 & -5 \\ 0 & 1 & 0 & 3 \\ 0 & 0 & 1 & 1 \end{array}\right]$$

Because this system of equations is consistent, it follows that b is indeed a linear 
combination of the columns of A and therefore b lies in the span of $a_1$, $a_2$, and 
$a_3$. But by theorem 1.5.2 we can now make a much more general observation. 
Because we see that the coefficient matrix A has a pivot in every row, it follows 
that regardless of which vector b we choose in $\mathbb{R}^3$, we can write that vector 
as a linear combination of the columns of A. That is, the vectors $a_1$, $a_2$, and 
$a_3$ span all of $\mathbb{R}^3$ and the equation Ax = b will be consistent for every choice 
of b. 
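
Part (d) of theorem 1.5.2 suggests a test that involves only the matrix A: count the pivots and compare with the number of rows. The Python sketch below is an illustration under stated assumptions; the names `pivot_count` and `columns_span_Rm` are hypothetical, and forward elimination alone is used because only the number of pivots matters here, not the fully reduced form.

```python
from fractions import Fraction


def pivot_count(rows):
    """Count pivot positions using forward elimination with exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0  # number of pivots found so far
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):                # clear entries below the pivot
            factor = m[i][c] / m[r][c]
            m[i] = [vi - factor * vr for vi, vr in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def columns_span_Rm(A):
    """True when A has a pivot in every row, so Ax = b is consistent for every b."""
    return pivot_count(A) == len(A)


A_154 = [[1, 2, 0], [3, 1, 5], [4, -1, 10]]   # columns from example 1.5.4
A_152 = [[1, 2, 0], [3, 1, 5], [4, -1, 9]]    # columns from example 1.5.2
print(columns_span_Rm(A_154))
print(columns_span_Rm(A_152))
```

The two matrices differ in a single entry, yet only the first has a pivot in every row, which is why example 1.5.2 found a vector outside the span while example 1.5.4 does not.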


This example demonstrates that it is in some sense ideal if a matrix A has 
a pivot in every row. As we proceed with further study of linear algebra, we 
will focus more and more on properties of the coefficient matrix and their 
implications for related systems of equations. We conclude this section by 
examining a key link between homogeneous and nonhomogeneous equations 
in order to foreshadow an essential concept in our pending study of differential 
equations. 


Example 1.5.5 Solve the nonhomogeneous system of linear equations given 
by the equation Ax = b where A and b are 


1 1 1 1 1 
2 1-1 3 —8 
10-2 2)’ De -—9 
8 5 —-1 11 —22 


If more than one solution exists, express the solution in parametric vector form. 


Solution. Note that the coefficient matrix A is identical to the one in 
example 1.4.4, so that here we are simply considering a related nonhomogeneous 
equation. We augment the matrix A with b and then row reduce to find 


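\[
\left[\begin{array}{rrrr|r} 1 & 1 & 1 & 1 & 1 \\ 2 & 1 & -1 & 3 & -8 \\ 1 & 0 & -2 & 2 & -9 \\ 8 & 5 & -1 & 11 & -22 \end{array}\right]
\rightarrow
\left[\begin{array}{rrrr|r} 1 & 0 & -2 & 2 & -9 \\ 0 & 1 & 3 & -1 & 10 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right]
\]

As we found with the homogeneous equation, the system is consistent and has
two free variables, and therefore infinitely many solutions. These solutions must
satisfy the equations

\[
\begin{aligned}
x_1 &= -9 + 2x_3 - 2x_4 \\
x_2 &= 10 - 3x_3 + x_4
\end{aligned}
\]

where x3 and x4 are free. Equivalently, it must be the case that any solution x
has the form

\[
\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}
= \begin{bmatrix} -9 + 2x_3 - 2x_4 \\ 10 - 3x_3 + x_4 \\ x_3 \\ x_4 \end{bmatrix}
= \begin{bmatrix} -9 \\ 10 \\ 0 \\ 0 \end{bmatrix}
+ x_3 \begin{bmatrix} 2 \\ -3 \\ 1 \\ 0 \end{bmatrix}
+ x_4 \begin{bmatrix} -2 \\ 1 \\ 0 \\ 1 \end{bmatrix}
\]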


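where x3 and x4 are real numbers. Observe that if we let xp = [-9 10 0 0]^T
and let xh be any vector of the form

\[
\mathbf{x}_h = x_3 \begin{bmatrix} 2 \\ -3 \\ 1 \\ 0 \end{bmatrix}
+ x_4 \begin{bmatrix} -2 \\ 1 \\ 0 \\ 1 \end{bmatrix}
\]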


then any solution to the equation Ax = b has the form x = xp + xh. Moreover,
it is now apparent that this vector xh is the same general solution vector that
we found for the corresponding homogeneous equation in example 1.4.4. In
addition, it is straightforward to check that Axp = b. Thus, we see that the general
solution to the nonhomogeneous equation contains the general solution to the
corresponding homogeneous equation.
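The structure x = xp + xh found in example 1.5.5 can be verified with a few lines of arithmetic. The Python sketch below is an illustration only and not part of the original text; the helper names `matvec` and `solution` are our own. It checks that Axp = b, that each of the two homogeneous direction vectors is sent to the zero vector, and that xp plus any combination of them (over a small grid of integer values for the free variables) still solves Ax = b.

```python
A = [[1, 1, 1, 1],
     [2, 1, -1, 3],
     [1, 0, -2, 2],
     [8, 5, -1, 11]]
b = [1, -8, -9, -22]

x_p = [-9, 10, 0, 0]   # particular solution
h1 = [2, -3, 1, 0]     # homogeneous direction multiplied by x3
h2 = [-2, 1, 0, 1]     # homogeneous direction multiplied by x4


def matvec(M, x):
    """Compute the matrix-vector product Mx."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]


def solution(x3, x4):
    """Build x = x_p + x3*h1 + x4*h2 for chosen values of the free variables."""
    return [p + x3 * u + x4 * v for p, u, v in zip(x_p, h1, h2)]


particular_ok = matvec(A, x_p) == b
homogeneous_ok = matvec(A, h1) == [0, 0, 0, 0] and matvec(A, h2) == [0, 0, 0, 0]
all_ok = all(matvec(A, solution(s, t)) == b
             for s in range(-3, 4) for t in range(-3, 4))
print(particular_ok, homogeneous_ok, all_ok)
```

A finite grid of test values is of course not a proof; the general argument is the short computation with A(xp + xh) given in the text.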


It appears from example 1.5.5 that if we have a solution, say xp, to
a nonhomogeneous equation Ax = b, we may add any solution xh to the
homogeneous equation Ax = 0 to xp and still have a solution to Ax = b. To see
why any vector of the form xp + xh is a solution to Ax = b, let us assume that xp
is a solution to Ax = b, and xh is a solution to Ax = 0. We claim that x = xp + xh
is also a solution to Ax = b. This holds since

\[
\begin{aligned}
A\mathbf{x} &= A(\mathbf{x}_p + \mathbf{x}_h) \\
&= A\mathbf{x}_p + A\mathbf{x}_h \\
&= \mathbf{b} + \mathbf{0} \\
&= \mathbf{b}
\end{aligned}
\tag{1.5.1}
\]


This shows that the solution to the corresponding homogeneous
equation plays a central role in the solution of nonhomogeneous equations.
One observation we can make is that in the event we can find a single particular
solution xp to the nonhomogeneous equation, if the corresponding homogeneous
equation has at least one free variable, then we know that there must be
infinitely many solutions to the nonhomogeneous equation as well. We could
even take the perspective that, in order to solve a nonhomogeneous equation, we
simply need to do two things: find one particular solution to Ax = b, and then
combine that particular solution with the general solution to the corresponding
homogeneous equation Ax = 0. While this is not so useful with systems of linear
algebraic equations, it turns out that this approach of solving the homogeneous
equation first is essential in the solution of differential equations.

The following example shows how the same structure is present in a class
of differential equations that we will discuss in detail in section 2.3.


Example 1.5.6 Consider the differential equations y' + 3y = 0 and y' + 3y = 6.
Compare and contrast the solutions to these two equations.


Solution. The first equation, y' + 3y = 0, we will call a homogeneous linear
first-order differential equation. Note that it asks a straightforward question:
what function y(t) is such that the function's derivative plus 3 times itself is the
zero function? Said differently, we seek a function y such that y' = -3y. From
our experience with exponential functions in calculus, we know that if y = e^{-3t},
then y' = -3e^{-3t}. The same is true for functions like y = 2e^{-3t} and y = -5e^{-3t};
indeed, we see that for any constant C, the function y = Ce^{-3t} satisfies the
differential equation. (It also turns out that these are the only functions that
satisfy the differential equation.)

If we next consider the related differential equation y' + 3y = 6, one that
we will call a nonhomogeneous linear first-order differential equation, we see that
there is one obvious solution to the equation. In particular, if we let y(t) be the
constant function y(t) = 2, then y'(t) = 0 and this function clearly makes the
differential equation true since 0 + 3 · 2 = 6.

Now, we should wonder if we have found all of the possible solutions to
y' + 3y = 6. The answer is no: as we will see in section 2.3, it turns out that the
general solution y to this differential equation is

\[
y(t) = 2 + Ce^{-3t}
\]

We can verify that this is the case by direct substitution. Note that y' = -3Ce^{-3t}
and therefore

\[
y' + 3y = -3Ce^{-3t} + 3(2 + Ce^{-3t}) = -3Ce^{-3t} + 6 + 3Ce^{-3t} = 6
\]

Observe the structure of this solution function: if we let yp = 2, we have a
particular solution to the nonhomogeneous equation. Further, letting yh =
Ce^{-3t}, this is the general solution to the related homogeneous equation. This
demonstrates that the overall solution to the nonhomogeneous equation is

\[
y = y_p + y_h = 2 + Ce^{-3t}
\]
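The substitution carried out in example 1.5.6 can also be checked numerically. The following Python sketch is not part of the original text; the function names `y` and `residual`, the step size, and the sample values of C and t are our own choices. Rather than differentiating by hand, it approximates y' with a central difference and confirms that y' + 3y - 6 is close to zero for several values of C.

```python
import math


def y(t, C):
    """Candidate solution y(t) = 2 + C e^{-3t}."""
    return 2 + C * math.exp(-3 * t)


def residual(t, C, h=1e-5):
    """Approximate y'(t) + 3 y(t) - 6, using a central difference for y'."""
    dy = (y(t + h, C) - y(t - h, C)) / (2 * h)
    return dy + 3 * y(t, C) - 6


# Sample several constants C and times t in [0, 2]; C = 0 gives the particular solution y = 2.
worst = max(abs(residual(t / 10, C))
            for C in (-5, 0, 1, 2)
            for t in range(0, 21))
print(worst < 1e-6)
```

Because the derivative is only approximated, the residual is small rather than exactly zero; the exact verification is the algebra shown above.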


Exercises 1.5 For each of the following m x n matrices A in exercises 1-8,
determine whether the equation Ax = b is consistent for every choice of b ∈ R^m.
If not, describe the set of all b ∈ R^m for which the equation is consistent. In each
case, explain your reasoning fully.


4 -1 
vee 


4 -l] 
rae[ tt] 
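4. \( A = \begin{bmatrix} 2 & 1 \\ -1 & 3 \\ 4 & -2 \end{bmatrix} \)

5. \( A = \begin{bmatrix} 1 & 5 & -2 \\ 2 & -1 & 7 \\ -3 & 4 & -14 \end{bmatrix} \)

6. \( A = \begin{bmatrix} 1 & 5 & -2 \\ 2 & -1 & 7 \\ -3 & 4 & -13 \end{bmatrix} \)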
10 | 
010 
oe 001 
0 0 | 
8. \( A = \begin{bmatrix} 1 & 0 & 0 & 2 \\ 0 & 1 & 0 & 5 \\ 0 & 0 & 1 & -3 \end{bmatrix} \)

9. If A is an m x n matrix and m > n, is it possible for the equation Ax = b to
be consistent for every b ∈ R^m? Explain.


10. If A is an m x n matrix and m < n, is it guaranteed that the equation
Ax = b will be consistent for every b ∈ R^m? Explain.


In each of exercises 11-16, determine whether the given vector b is in the span
of the columns of the given matrix A. If b lies in the span of the columns of A,
determine weights that enable you to explicitly write b as a linear combination
of the columns of A.

16. \( \mathbf{b} = \begin{bmatrix} -4 \\ -2 \\ 1 \end{bmatrix}, \quad A = \begin{bmatrix} 1 & 5 & -2 \\ 2 & -1 & 7 \\ -3 & 4 & -13 \end{bmatrix} \)


For each matrix A given in exercises 17-21, determine the general solution xh
to the homogeneous equation Ax = 0.


1-3 2 
W.A=[_4 | 


18. \( A = \begin{bmatrix} 1 & 2 & 0 & 1 \\ 3 & 1 & 5 & -7 \\ 4 & -1 & 10 & -13 \end{bmatrix} \)


—5 8 
ian 10 4a 


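20. \( A = \begin{bmatrix} 3 & 1 & -1 \\ 1 & 3 & 1 \\ -1 & 1 & 3 \end{bmatrix} \)

21. \( A = \begin{bmatrix} 1 & -1 & 2 \\ 4 & -2 & 6 \\ -7 & 3 & -10 \end{bmatrix} \)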


In exercises 22-26, solve the nonhomogeneous equation Ax = b, given the
matrix A and vector b. Express your solution x (if one exists) in the form
x = xp + xh, where xp is a particular solution to Ax = b and xh is the solution
to the corresponding homogeneous equation Ax = 0. Compare your results to
exercises 17-21, respectively.


1-3 2 5 
2.a=| ‘ a b=|_5| 


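23. \( A = \begin{bmatrix} 1 & 2 & 0 & 1 \\ 3 & 1 & 5 & -7 \\ 4 & -1 & 10 & -13 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix} \)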
Suppose that A is a 6 x 9 matrix that has a pivot in every row. What can
you say about the consistency of Ax = b for every b ∈ R^6? Why?


Suppose that A is a 3 x 4 matrix and that the span of the columns of A is
R^3. What can you say about the consistency of Ax = b for every b ∈ R^3?
Why?


If possible, give an example of a 3 x 2 matrix A such that the span of the
columns of A is R^3. If finding such a matrix is impossible, explain why.


Suppose that A is a 4 x 3 matrix for which the homogeneous equation
Ax = 0 has only the trivial solution. Will the equation Ax = b be
consistent for every b ∈ R^4? Explain. For the vectors b for which Ax = b is
indeed a consistent equation, how many solution vectors x does each
equation have? Why?


Suppose that A is a 3 x 4 matrix for which the homogeneous equation
Ax = 0 has exactly one free variable present. Will the equation Ax = b be
consistent for every b ∈ R^3? Explain. For the vectors b for which Ax = b is
indeed a consistent equation, how many solution vectors x does each
equation have? Why?


Suppose that A is a 4 x 5 matrix for which the homogeneous equation
Ax = 0 has exactly two free variables present. Will the equation Ax = b be
consistent for every b ∈ R^4? Explain. For the vectors b for which Ax = b is
indeed a consistent equation, how many solution vectors x does each
equation have? Why?


Decide whether each of the following sentences is true or false. In every 
case, write one sentence to support your answer. 


(a) If Ax = b is consistent for at least one vector b, then A has a pivot in
every row.

(b) If A is a 4 x 3 matrix, then it is possible for the columns of A to span R^4.

(c) If A is a 3 x 3 matrix with exactly two pivot columns, then the columns
of A do not span R^3.

(d) If A is a 3 x 4 matrix, then the columns of A must span R^3.

(e) If y and z are solutions to the equation Ax = 0, then the vector y + z is
also a solution to Ax = 0.

(f) If y and z are solutions to the equation Ax = b, where b ≠ 0, then the
vector y + z is also a solution to Ax = b.


Solve the linear first-order differential equation y' + y = 3 by first finding
all functions yh that satisfy the homogeneous equation y' + y = 0 and then
determining a constant function yp that is a solution to y' + y = 3. Verify
by direct substitution that y = yh + yp is a solution to the given equation.


Solve the linear first-order differential equation y' - 5y = 6 by first finding
all functions yh that satisfy the homogeneous equation y' - 5y = 0 and
then determining a constant function yp that is a solution to y' - 5y = 6.
Verify by direct substitution that y = yh + yp is a solution to the given
equation.


1.6 Linear independence 


In theorem 1.5.2, we found that when solving Ax = b, an ideal situation occurs 
when A has a pivot position in every row. Equivalently, this means that the 
equation Ax = b is guaranteed to have at least one solution for every vector 
b ∈ R^m (when A is m x n), or that every b ∈ R^m can be written as a linear
combination of the columns of A. In other words, regardless of the choice of b, 
the equation Ax = b is always consistent. Because the equation is consistent, we 
are guaranteed that at least one solution x exists. In what follows, we explore 
conditions that imply not only that at least one solution exists, but in fact that 
only one solution exists. First, we consider the simpler situation of homogeneous 
equations. 

In section 1.4, we discovered that the equation Ax = 0 is always consistent. 
Because x = 0 always makes this equation true, we know that we at least have 
the trivial solution present. It is natural to ask: under what conditions on A 
is the trivial solution the only solution to the homogeneous equation Ax = 0? 
Geometrically, we are asking whether or not a nontrivial linear combination of 
the columns of A can be formed that leads to the zero vector. 

We revisit an earlier example to further explore these issues. 


Example 1.6.1 Does the equation Ax = 0 have nontrivial solutions if A is the
matrix whose columns are a1 = [2 1]^T and a2 = [-1 -1/2]^T? Discuss the
geometric implications of your conclusions.


Solution. We first consider the corresponding augmented matrix and row
reduce, finding that

\[
\left[\begin{array}{rr|r} 2 & -1 & 0 \\ 1 & -\tfrac{1}{2} & 0 \end{array}\right]
\rightarrow
\left[\begin{array}{rr|r} 1 & -\tfrac{1}{2} & 0 \\ 0 & 0 & 0 \end{array}\right]
\]

This shows that any vector x = [x1 x2]^T that satisfies x1 = (1/2)x2 will be a solution
to Ax = 0. The presence of the free variable x2 implies that there are infinitely
many nontrivial solutions to this equation.

If we interpret the matrix-vector product Ax as the linear combination
Ax = x1 a1 + x2 a2, then the equation

\[
\tfrac{1}{2} x_2 \mathbf{a}_1 + x_2 \mathbf{a}_2 = \mathbf{0}
\]

implies geometrically that the zero vector (on the right) may be expressed
as a nontrivial linear combination of a1 and a2. For example, a1 + 2a2 = 0.


Figure 1.11 Linear combinations of a1 and a2
from example 1.6.1.


Indeed, if we consider figure 1.11 this conclusion is evident: if we add one
length of a1 to two lengths of a2, we end up at 0.


Another way to express the equation a1 + 2a2 = 0 is to write a1 = -2a2. In
this setting, we can see that a1 depends on a2, and that the relationship is given
by a linear equation. We hence say that a1 and a2 are linearly dependent vectors.

The situation in example 1.6.1, where the vectors a1 and a2 are parallel, is in
contrast to that of example 1.4.3, where we instead considered the non-parallel
vectors a1 = [2 1]^T and a2 = [1 2]^T; in that setting, if we solve the associated
homogeneous equation Ax = 0, we find that

\[
\left[\begin{array}{rr|r} 2 & 1 & 0 \\ 1 & 2 & 0 \end{array}\right]
\rightarrow
\left[\begin{array}{rr|r} 1 & 0 & 0 \\ 0 & 1 & 0 \end{array}\right]
\]


In this case, the only solution to Ax = 0 is the trivial solution, x = 0. The
geometry of the situation also informs us: if we desire a linear combination of
the vectors a1 and a2 (as shown in figure 1.12) that results in the zero vector, we
see that the only way to accomplish this is to take 0a1 + 0a2. Said differently, if
we take any nontrivial linear combination c1 a1 + c2 a2, we end up at a location
other than the origin.

When a1 and a2 in example 1.6.1 were parallel, we said that a1 and a2 were
linearly dependent. In the current context, where a1 and a2 are not parallel, it
makes sense to say that a1 and a2 are linearly independent, since neither depends
on the other.

Of course, in linear algebra we often consider sets of more than two vectors.
The next definition formalizes what the terms linearly dependent and linearly
independent mean in a more general context. Observe that the key criterion is
a geometric one: can we form a nontrivial linear combination of vectors that
results in 0?


Figure 1.12 Linear combinations of a1 and a2
from example 1.4.3.


Definition 1.6.1 Given a set S = {v1, ..., vk} where each vector vi ∈ R^m, the
set S is linearly dependent if there exists a nontrivial solution x to the vector
equation

\[
x_1 \mathbf{v}_1 + x_2 \mathbf{v}_2 + \cdots + x_k \mathbf{v}_k = \mathbf{0} \tag{1.6.1}
\]

If (1.6.1) has only the trivial solution, then we say the set S is linearly
independent.


Note that (1.6.1) also takes us back to the fundamental questions about 
any linear system of equations: “does at least one solution exist?” (Yes; the zero 
vector is always a solution.) And “is that solution unique?” (Maybe; only if 
the vectors are linearly independent and the zero vector is the only solution.) 
The latter question addresses the fundamental issue of linear independence. We 
consider an example to demonstrate how we interpret the language of this most 
recent definition as well as how we will generally respond to the question of 
whether or not a set of vectors is linearly independent. 


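Example 1.6.2 Determine whether the set S = {v1, v2, v3} is linearly
independent or linearly dependent if

\[
\mathbf{v}_1 = \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}, \quad
\mathbf{v}_2 = \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}, \quad
\mathbf{v}_3 = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}
\]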




Solution. By definition, the linear independence of the set S rests on whether
or not nontrivial solutions exist to the vector equation x1 v1 + x2 v2 + x3 v3 = 0.
Letting A = [v1 v2 v3], we know that this question is equivalent to determining
whether or not Ax = 0 has a nontrivial solution. Considering the augmented
matrix [A | 0] and row-reducing, we find

\[
\left[\begin{array}{rrr|r} 1 & -1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ -1 & 1 & 1 & 0 \end{array}\right]
\rightarrow
\left[\begin{array}{rrr|r} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{array}\right]
\tag{1.6.2}
\]


It follows that Ax = 0 has only the trivial solution, and therefore the set S is
linearly independent. Geometrically, this means that if we take any nontrivial
combination of v1, v2, and v3, the result is a vector that is not the zero vector.


From example 1.6.2, we see how we will normally test a set of vectors for linear
independence: we take advantage of our understanding of linear combinations
and matrix multiplication and convert the vector equation x1 v1 + x2 v2 + ... +
xk vk = 0 to the matrix equation Ax = 0, where A is the matrix with columns
v1, ..., vk. Row-reducing, we can test whether or not nontrivial solutions exist
to Ax = 0 by examining pivot locations in the matrix A.
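The test just described (place the vectors in the columns of A, row-reduce, and look for a pivot in every column) can be sketched in Python. This code is an illustration and not part of the original text; the names `count_pivots` and `is_linearly_independent` are our own, and exact fractions from the standard library are used so that zero entries are detected reliably.

```python
from fractions import Fraction


def count_pivots(rows):
    """Reduce a matrix to echelon form (exactly) and return its number of pivots."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # number of pivots found so far, also the next pivot row
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue  # column c has no pivot, so it corresponds to a free variable
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def is_linearly_independent(vectors):
    """True when the matrix whose columns are the given vectors has a pivot in every column."""
    matrix = [list(row) for row in zip(*vectors)]  # the vectors become the columns
    return count_pivots(matrix) == len(vectors)


parallel = [[2, 1], [-1, Fraction(-1, 2)]]         # a1, a2 from example 1.6.1
non_parallel = [[2, 1], [1, 2]]                    # a1, a2 from example 1.4.3
S = [[1, 1, -1], [-1, 0, 1], [0, 1, 1]]            # v1, v2, v3 from example 1.6.2
print(is_linearly_independent(parallel),
      is_linearly_independent(non_parallel),
      is_linearly_independent(S))
```

Only the number of pivots matters for this question, so the sketch stops at echelon form instead of computing the fully reduced matrix.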

Several facts about linear dependence and independence will prove to be 
useful in many aspects of our upcoming work. We simply state them here, and 
leave their verification to the exercises at the end of this section: 


* Any set containing the zero vector is linearly dependent.

* Any set {v1} consisting of a single nonzero vector is linearly independent.

* Any set of two vectors {v1, v2} is linearly independent whenever v1 is not a
scalar multiple of v2.

* The columns of a matrix A are linearly independent if and only if the
equation Ax = 0 has only the trivial solution.


The concepts of linear independence and span both involve linear 
combinations of a set of vectors. Furthermore, there are many important and 
natural connections between span and linear independence. The next example 
extends the previous one and lays the foundation for a discussion of several 
general results. 


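Example 1.6.3 Let the vectors v1, v2, v3, and v4 be given by

\[
\mathbf{v}_1 = \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}, \quad
\mathbf{v}_2 = \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}, \quad
\mathbf{v}_3 = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}, \quad
\mathbf{v}_4 = \begin{bmatrix} 5 \\ 6 \\ -1 \end{bmatrix}
\]

Let R = {v1, v2}, S = {v1, v2, v3}, and T = {v1, v2, v3, v4}. Which of the sets R,
S, and T are linearly independent? Which of the sets R, S, and T span R^3?


Solution. We have already seen in example 1.6.2 that the set S is linearly
independent. Moreover, we saw that when we let A = [v1 v2 v3] and row-reduce
the augmented matrix for the equation Ax = 0, it follows that

\[
\left[\begin{array}{rrr|r} 1 & -1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ -1 & 1 & 1 & 0 \end{array}\right]
\rightarrow
\left[\begin{array}{rrr|r} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{array}\right]
\]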

Not only does this show that the vectors in set S are linearly independent (Ax = 0
has only the trivial solution because A has a pivot in every column so there are
no free variables present), but also, by theorem 1.5.2, the vectors in S span R^3
since A has a pivot in every row. Since the vectors in S span R^3, this means that
we can write every vector in R^3 as a linear combination of the three vectors in S.
Moreover, since A has a pivot in every column, it will also follow that every such
linear combination is unique: every vector in R^3 can be written in exactly one
way as a linear combination of v1, v2, and v3.

What happens if we remove v3 from S and instead consider the set R =
{v1, v2}? To answer the question of linear independence, we ask if there is a
nontrivial solution to the vector equation x1 v1 + x2 v2 = 0. Equivalently, we let
B be the 3 x 2 matrix whose columns are v1 and v2 and solve Bx = 0. Doing so,
we find that

\[
\left[\begin{array}{rr|r} 1 & -1 & 0 \\ 1 & 0 & 0 \\ -1 & 1 & 0 \end{array}\right]
\rightarrow
\left[\begin{array}{rr|r} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{array}\right]
\]

so only the trivial solution exists and thus the set R is linearly independent. Note
again that this is due to the fact that B has a pivot in every column. This should
not be surprising, since we removed a vector from the linearly independent set S
to get the set R: if the vectors in S do not depend on one another, neither should
the vectors in R.

On the other hand, we can also say by theorem 1.5.2 that the set R does not
span R^3, since B does not have a pivot position in every row. For example, the
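vector b = [0 1 1]^T cannot be written as a linear combination of v1 and v2.
This can be seen by row-reducing the augmented matrix that represents Bx = b,
where we find that

\[
\left[\begin{array}{rr|r} 1 & -1 & 0 \\ 1 & 0 & 1 \\ -1 & 1 & 1 \end{array}\right]
\rightarrow
\left[\begin{array}{rr|r} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array}\right]
\]

The last equation tells us that 0x1 + 0x2 = 1, which is impossible, and thus b
cannot be written as a linear combination of the vectors in R.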

Finally, we consider the set T = {v1, v2, v3, v4}. To test if T is linearly
independent, we let C be the matrix whose columns are v1, v2, v3, and v4,
and consider the equation Cx = 0, which corresponds to the equation x1 v1 +
x2 v2 + x3 v3 + x4 v4 = 0. Row-reducing,

\[
\left[\begin{array}{rrrr|r} 1 & -1 & 0 & 5 & 0 \\ 1 & 0 & 1 & 6 & 0 \\ -1 & 1 & 1 & -1 & 0 \end{array}\right]
\rightarrow
\left[\begin{array}{rrrr|r} 1 & 0 & 0 & 2 & 0 \\ 0 & 1 & 0 & -3 & 0 \\ 0 & 0 & 1 & 4 & 0 \end{array}\right]
\]


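Note that the variable x4 is free, since C does not have a pivot in its fourth
column. This shows that any vector x with entries x1, x2, x3, and x4 such that
x1 = -2x4, x2 = 3x4, and x3 = -4x4 will be a solution to the equation Cx = 0.
For example, taking x4 = 1, it follows that

\[
-2 \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}
+ 3 \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}
- 4 \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}
+ 1 \begin{bmatrix} 5 \\ 6 \\ -1 \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]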


Thus, the set T is linearly dependent. We can also see from our computations
that the set T does indeed span R^3, since the matrix C has a pivot position in
every row. This result should be expected: we have already shown that every
vector in R^3 can be written as a linear combination of the vectors in S, and the
set T contains all three vectors in S.


There are many important generalizations we can make from example 1.6.3. 
For instance, from an algebraic perspective we see that we can easily answer 
questions about the linear independence and span of the columns of a matrix 
simply by considering the location of pivots in the matrix. In particular, the 
columns of A are linearly independent if and only if A has a pivot in every 
column, while the columns of A span R^m if and only if A has a pivot in every
row. We state these results formally in the two following theorems. 


Theorem 1.6.1 Let A be an m x n matrix. The following statements are
equivalent:


a. The columns of A span R^m.
b. A has a pivot position in every row.
c. The equation Ax = b is consistent for every b ∈ R^m.


In the next theorem, note particularly the change in emphasis in state- 
ment (b) from rows to columns when considering pivot positions in the matrix. 


Theorem 1.6.2 Let A be an m x n matrix. The following statements are
equivalent:


a. The columns of A are linearly independent.
b. A has a pivot position in every column.
c. The equation Ax = 0 has only the trivial solution.
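Theorems 1.6.1 and 1.6.2 reduce both questions to counting pivots: the columns span R^m when the number of pivots equals the number of rows, and they are linearly independent when it equals the number of columns. The Python sketch below is not part of the original text and uses our own helper names `count_pivots` and `classify`; it applies this count to the sets R, S, and T of example 1.6.3.

```python
from fractions import Fraction


def count_pivots(rows):
    """Reduce a matrix to echelon form (exactly) and return its number of pivots."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def classify(vectors):
    """Return (independent, spans) for a list of vectors that all lie in R^m."""
    matrix = [list(row) for row in zip(*vectors)]  # the vectors become the columns
    pivots = count_pivots(matrix)
    independent = pivots == len(vectors)  # pivot in every column (theorem 1.6.2)
    spans = pivots == len(matrix)         # pivot in every row (theorem 1.6.1)
    return independent, spans


v1, v2, v3, v4 = [1, 1, -1], [-1, 0, 1], [0, 1, 1], [5, 6, -1]
for name, vectors in [("R", [v1, v2]), ("S", [v1, v2, v3]), ("T", [v1, v2, v3, v4])]:
    print(name, classify(vectors))
```

Only the square case S can return True for both properties, which anticipates the discussion that follows.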
At this point, it appears ideal if a set is linearly independent or spans R^m.
The best scenario, then, is the case when a set has both of these properties 
and forms a linearly independent spanning set. In this case, for the matrix 
whose columns are the vectors in the set, we need the matrix to have a 
pivot in every column, as well as in every row. As we saw in example 1.6.3 
with the set S and the corresponding matrix A, this can only happen when 
the number of vectors in the set S matches the number of entries in each 
vector. In other words, the corresponding matrix A must be square. Obviously 
if a square matrix has a pivot in every row, it must also have a pivot 
in every column, and vice versa. We close our current discussion with an 
important result that links the concepts of linear independence and span in 
the columns of a square matrix; theorem 1.6.3 is a consequence of the two 
preceding ones. 


Theorem 1.6.3 Let A be an n x n matrix. The following statements are
equivalent:


a. The columns of A are linearly independent.
b. The columns of A span R^n.
c. A has a pivot position in every column.
d. A has a pivot position in every row.
e. For each b ∈ R^n, the equation Ax = b has a unique solution.


Theorem 1.6.3 shows that square matrices play a particularly important role in 
linear algebra, an idea that will further demonstrate itself when we study the 
notion of the inverse of a matrix in the following section. 

We conclude this section with a look ahead to our study of linear differential 
equations, in which the concepts of linear independence and span will also find 
a prominent role. 


Example 1.6.4 Consider the differential equation y'' + y = 0. Explain why the
function y = c1 cos t + c2 sin t is a solution to the differential equation.


Solution. In our upcoming study of differential equations, we will call the
equation y'' + y = 0 a linear second-order homogeneous equation with constant
coefficients. Equations of this form will be considered in chapter 3 and be the
focus of chapter 4.

For now, we can intuitively understand why y = c1 cos t + c2 sin t is a
solution to the equation. Note that in order to solve the equation y'' + y = 0, we
must find all functions y such that y'' = -y. From our experience in calculus,
we know that

\[
\frac{d}{dt}[\sin t] = \cos t \quad \text{and} \quad \frac{d}{dt}[\cos t] = -\sin t
\]

Furthermore, if we consider second derivatives,

\[
\frac{d^2}{dt^2}[\sin t] = \frac{d}{dt}[\cos t] = -\sin t
\quad \text{and} \quad
\frac{d^2}{dt^2}[\cos t] = \frac{d}{dt}[-\sin t] = -\cos t
\]


Hence, the second derivative of each basic trigonometric function is the opposite
of itself, which makes both y = cos t and y = sin t solutions to the equation
y'' + y = 0.

Moreover, it is a straightforward exercise to show (using properties of the
derivative) that any scalar multiple (such as y = 3 sin t) of either function is
also a solution to the differential equation, as is any combination of the form
y = 2 cos t + 3 sin t. More generally, this makes any function

\[
y = c_1 \cos t + c_2 \sin t
\]

a solution to the differential equation.
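The claim of example 1.6.4 can be spot-checked numerically. The Python sketch below is not part of the original text; the function names, the step size h, and the sample constants are our own choices. It approximates y'' with a second-order central difference and confirms that y'' + y is close to zero for several linear combinations of cos t and sin t.

```python
import math


def y(t, c1, c2):
    """Candidate solution y(t) = c1 cos t + c2 sin t."""
    return c1 * math.cos(t) + c2 * math.sin(t)


def residual(t, c1, c2, h=1e-3):
    """Approximate y''(t) + y(t), using a central difference for the second derivative."""
    second = (y(t + h, c1, c2) - 2 * y(t, c1, c2) + y(t - h, c1, c2)) / h ** 2
    return second + y(t, c1, c2)


# A few choices of (c1, c2), including cos t alone, sin t alone, and 2 cos t + 3 sin t.
constants = [(1, 0), (0, 1), (2, 3), (-1.5, 4)]
worst = max(abs(residual(t / 4, c1, c2))
            for c1, c2 in constants
            for t in range(-20, 21))
print(worst < 1e-4)
```

The finite-difference residual is small but not exactly zero; the exact statement follows from the derivative formulas above.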


If we think about our understanding of linear independence for a set of two
vectors, we find an analogy to the two functions cos t and sin t: since these
two functions are not scalar multiples of one another, it makes sense to call
these functions linearly independent. Moreover, from the form of the function
y = c1 cos t + c2 sin t, we are taking linear combinations of the basic trigonometric
functions to form other solutions to the differential equation. We can even go
so far as to say that the solution set to the differential equation is the span of the
two functions cos t and sin t.

In future work, we will see that this broader perspective on linear 
independence and span serves us well in solving linear differential equations. 
We will gain additional understanding of why the solution set to every second- 
order linear homogeneous differential equation with constant coefficients 
demonstrates a similar structure in subsequent work. 


Exercises 1.6 In each of exercises 1-8, determine whether the given set S is
linearly independent or linearly dependent.
1. S = {v1, v2} where v1 = [3 -2]^T and v2 = [-9 6]^T
2. S = {v1, v2} where v1 = [1 0]^T and v2 = [0 1]^T
3. S = {v1, v2} where v1 = [5 -2]^T and v2 = [5 2]^T
4. S = {v1, v2, v3} where v1 = [5 -2]^T, v2 = [5 2]^T, and v3 = [11 -5]^T
5. S = {v1, v2, v3} where v1 = [-1 2 1]^T, v2 = [3 1 1]^T, and v3 = [1 5 3]^T
6. S = {v1, v2, v3} where v1 = [-1 2 1]^T, v2 = [3 1 1]^T, and v3 = [1 5 2]^T
7. S = {v1, v2} where v1 = [1 -2 4 3]^T and v2 = [-3 6 -12 -9]^T
8. S = {v1, v2, v3, v4} where v1 = [-1 2 1]^T, v2 = [3 1 1]^T, v3 = [1 5 2]^T,
and v4 = [1 1 1]^T
9. For each of the sets S in exercises 1-8, determine whether or not S
spans R^m, where m is chosen appropriately.


10. Suppose that S is a set of three vectors in R°. Is it possible for S to span R?
Why or why not?

11. Suppose that S is a set of two vectors in R>. Is S linearly independent,
linearly dependent, or not necessarily either? Explain your answer.

12. Let S be a set of four vectors in R*. Is it possible for S to be linearly
independent? Is it possible for S to span R*? Why or why not?

13. Let S be a set of five vectors in R*. Must S span R*? Is it possible for S to be
linearly independent? Explain.

14. If A is an m x n matrix, for what relationship between n and m are the
columns of A guaranteed to not span R^m? For what relationship between
n and m will the columns have to be linearly dependent?

15. Prove that any set that contains the zero vector must be linearly dependent.

16. Explain why any set consisting of a single nonzero vector must be linearly
independent.

17. Show that any set of two vectors, {v1, v2}, is linearly independent if and
only if v1 is not a scalar multiple of v2.

18. Explain why the columns of a matrix A are linearly independent if and
only if the equation Ax = 0 has only the trivial solution.

19. Let v1 = [-1 2 1]^T, v2 = [3 1 1]^T, and v3 = [5 3 k]^T. For what value(s)
of k is {v1, v2, v3} linearly independent? For what value(s) of k is v3 in the
span of {v1, v2}? How are these two questions related?

20. Consider the set S = {v1, v2, v3} where v1 = [1 0 0]^T, v2 = [0 1 0]^T, and
v3 = [0 0 1]^T. Explain why S spans R^3, and also why S is linearly
independent. In addition, determine the weights x1, x2, and x3 that allow
you to write the vector [-27 13 91]^T as a (unique) linear combination of
v1, v2, v3. What do you observe?

21. Let A be a 4 x 7 matrix. Suppose that when solving the homogeneous
equation Ax = 0 there are three free variables present. Do the columns of
A span R^4? Explain. Are the columns of A linearly dependent, linearly
independent, or is it impossible to say? Justify your answer.

22. Suppose that A is a 9 x 6 matrix and that A has six pivot columns. Are the
columns of A linearly dependent, linearly independent, or is it impossible
to say? Do the columns of A span R^9, or is it impossible to tell? Justify your
answers.

23. Decide whether each of the following sentences is true or false. In every
case, write one sentence to support your answer.

(a) If the system represented by Ax = 0 has a free variable present, then
the columns of the matrix A are linearly independent vectors.

(b) If a matrix has more columns than rows, then the columns of the
matrix must be linearly dependent.

(c) If an m x n matrix A has a pivot in every column, then the columns of
A span R^m.

(d) If A is an m x n matrix that is not square, it is possible for its columns
to be both linearly independent and span R^m.

24. Consider the linear second-order homogeneous differential equation
y'' - y = 0. Show by direct substitution that y1 = e^t and y2 = e^{-t} are
solutions to the differential equation. In addition, show by substitution
that any linear combination y = c1 e^t + c2 e^{-t} is also a solution.


25. We have seen that the general solution to the linear second-order
differential equation y'' + y = 0 is given by

y(t) = c1 sin(t) + c2 cos(t)

Suppose we know initial values for y(0) and y'(0) to be

y(0) = 4 and y'(0) = -2

What are the values of c1 and c2? How is a system of linear equations
involved?


26. It can be shown that the solution to the linear second-order differential
equation y'' - y = 0 is given by

y(t) = c1 e^t + c2 e^{-t}

Suppose we know initial values for y(0) and y'(0) to be

y(0) = 4 and y'(0) = -2

What are the values of c1 and c2? How is a system of linear equations
involved?


1.7 Matrix algebra 


For a given system of linear equations, we are now interested in solving the
vector equation Ax = b, where A is a known m x n matrix, b ∈ R^m is given,
and we seek x ∈ R^n. It is natural to compare this equation to an elementary
linear equation such as 2x = 7. The key algebraic step in solving 2x = 7 is
to divide both sides of the equation by 2. Said differently, we multiply both
sides by the multiplicative inverse of the number 2. In anticipation of a new
approach to solving the vector equation Ax = b, we carefully state the details
required to solve 2x = 7. In particular, from the equation 2x = 7, it follows that
(1/2)(2x) = (1/2)(7), so that ((1/2) · 2)x = 7/2. Thus, 1 · x = 7/2, so x = 7/2. From a sophisticated
perspective, to solve the equation 2x = 7, we need to be able to multiply, to have
a multiplicative identity (that is, the number 1), and to be able to compute a
multiplicative inverse (here, the number 1/2).

In this section, we lay the foundation for similar ideas that provide an
alternate way to solve the equation Ax = b: essentially we are interested in 
determining whether we can find a matrix B so that when we compute BA the 
result is the matrix equivalent of “1”. To do this, we will first have to learn what 
it means to multiply two matrices; a simpler (and still important) place to begin 
is with the addition of matrices and multiplication of matrices by scalars. 

We already know how to add vectors and multiply them by scalars; similar 
principles hold for matrices. Two matrices can be added (or subtracted) if and 
only if they have an identical number of rows and columns. When addition 
(subtraction) is defined, the result is computed component-wise. Furthermore, 
the multiple of a matrix by a scalar c € R is attained by multiplying every entry 
of the matrix by the same constant c. The following example demonstrates these 
basic facts. 


Example 1.7.1 Let A and B be the matrices 
1 3 —4 —6 10 —-l 
a=o -7 Al B=| a 2 i 
Compute A+ B and —3A. 


Solution. Since A and B are both 2 x 3, their sum is defined and is given by 
1 3 —4 —6 10 —-1 —5 13 —-5 
a+B=|o —7 2|+[ a 2 all 2° 9 2 
The scalar multiple of a matrix is always defined, and -3A is given by

\[
-3A = \begin{bmatrix} -3 & -9 & 12 \\ 0 & 21 & -6 \end{bmatrix}
\]


Matrix addition, when defined, has all of the expected properties of addition.
In particular, A + B = B + A, so order does not matter, and we say matrix
addition is commutative. Since A + (B + C) = (A + B) + C, the way we group
more than two matrices to add also does not matter and we say matrix addition
is associative. There is even a matrix that acts like the number 0. If Z is a matrix
of the same number of rows and columns as A such that every entry in Z is zero,
then it follows that A + Z = Z + A = A. We call this zero matrix the additive
identity.

The next natural operation to consider, of course, is multiplication. What 
does it mean to multiply two matrices? And when does it even make sense 
to multiply two matrices? We know for matrix—vector multiplication that the 
product Ax computes the vector b that is the unique linear combination of 
the columns of A having the entries of the vector x as weights. Moreover, this 
product is only defined when the number of entries in x matches the number of 
columns of A. If we now consider a matrix B, we can naturally think about the
matrix product AB by considering the columns of B, say $b_1, \ldots, b_k$. In particular,
we make the following definition.

Definition 1.7.1 If A is an m x n matrix, and B is a matrix whose columns
are $b_1, \ldots, b_k$ such that the matrix-vector product $Ab_j$ is defined for each
$j = 1, \ldots, k$, then we define the matrix product AB by

\[ AB = \begin{bmatrix} Ab_1 & Ab_2 & \cdots & Ab_k \end{bmatrix} \tag{1.7.1} \]


Note particularly that since A has n columns, in order for $Ab_j$ to be defined
each $b_j$ must belong to $\mathbb{R}^n$. This in turn implies that the matrix B must have
dimensions n x k. Specifically, the number of rows in B must equal the number 
of columns in A. We explore matrix multiplication and its properties in the next 
example. 


Example 1.7.2 Let A and B be the matrices

\[ A = \begin{bmatrix} 1 & 3 & -4 \\ 0 & -7 & 2 \end{bmatrix}, \qquad B = \begin{bmatrix} -6 & 10 \\ 3 & 2 \end{bmatrix} \]

Compute the matrix products AB and BA, or explain why they are not defined.


Solution. First we consider AB. To do so, we would have to compute both $Ab_1$
and $Ab_2$, where $b_1$ and $b_2$ are the columns of B. But neither of these products is
defined, since A has three columns and B has just two rows. Thus, AB is not defined.

On the other hand, BA is defined. For instance, we can compute the first
column of BA by taking $Ba_1$, where $a_1$ is the first column of A, and we see that

\[ Ba_1 = \begin{bmatrix} -6 & 10 \\ 3 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} -6 \\ 3 \end{bmatrix} \]

Similar computations for $Ba_2$ and $Ba_3$ show that

\[ BA = \begin{bmatrix} -6 & -88 & 44 \\ 3 & -5 & -8 \end{bmatrix} \]
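The column-by-column definition of the product translates almost directly into code. The following Python sketch is one possible rendering, not a library routine: the names `mat_vec`, `columns`, and `mat_mul` are our own, and matrices are stored as lists of rows. It builds the product one column at a time and raises an error when the product is not defined.

```python
def mat_vec(A, x):
    """Return Ax: the linear combination of the columns of A with weights x."""
    if any(len(row) != len(x) for row in A):
        raise ValueError("Ax is undefined: x needs one entry per column of A")
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def columns(M):
    """List the columns of M, each as a list of entries."""
    return [list(col) for col in zip(*M)]


def mat_mul(A, B):
    """Return AB = [Ab_1 ... Ab_k], computed one column at a time."""
    product_columns = [mat_vec(A, b) for b in columns(B)]
    # Convert the list of columns back into a list of rows.
    return [list(row) for row in zip(*product_columns)]


A = [[1, 3, -4], [0, -7, 2]]
B = [[-6, 10], [3, 2]]

print(mat_mul(B, A))  # [[-6, -88, 44], [3, -5, -8]]

try:
    mat_mul(A, B)
except ValueError as err:
    print("AB is not defined:", err)
```

The error for AB arises exactly where the text says it should: each column of B has two entries, while A has three columns.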


There are several important observations to make based on example 1.7.2. One
is that if A is m x n and B is n x k so that the product AB is defined, then the
resulting matrix AB is m x k. This is true since the columns of AB are each of
the form $Ab_j$, thus being linear combinations of the columns of A, which have
m entries, so that AB has m rows. Moreover, we have to consider each of the
products $Ab_1, \ldots, Ab_k$, therefore giving AB k columns.

Furthermore, we clearly see that order matters in matrix multiplication. 
Specifically, given matrices A and B for which AB is defined, it is not even 
guaranteed that BA is defined, much less that AB = BA. Even when both 
products are defined, it is possible (even typical) that $AB \neq BA$. Formally, we
say that matrix multiplication is not commutative. This fact will be explored 
further in the exercises. It is, however, the case that matrix multiplication (for 
matrices of the appropriate sizes) is both associative and distributive. That is, 
A(BC) = (AB)C and A(B+ C) = AB + AC, again provided the sizes of the 
matrices make the relevant products and sums defined. 
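A small numerical experiment makes these properties concrete. In the Python sketch below, the three 2 x 2 matrices are hypothetical examples chosen only for illustration (they do not appear elsewhere in the text), and `mat_mul` and `mat_add` are our own helper names.

```python
def mat_mul(A, B):
    """Row-by-column product of A (m x n) and B (n x k)."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must match rows of B")
    return [
        [sum(A[i][t] * B[t][j] for t in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def mat_add(A, B):
    """Entry-wise sum of two matrices of the same size."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


# Illustrative matrices (an assumption of this sketch, not from the text).
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 3]]

print(mat_mul(A, B))  # [[2, 1], [4, 3]]
print(mat_mul(B, A))  # [[3, 4], [1, 2]]

# Associativity and distributivity hold for these matrices.
print(mat_mul(A, mat_mul(B, C)) == mat_mul(mat_mul(A, B), C))                 # True
print(mat_mul(A, mat_add(B, C)) == mat_add(mat_mul(A, B), mat_mul(A, C)))     # True
```

Here AB swaps the columns of A while BA swaps its rows, which is one way to see why the two products differ.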
Now, we should not forget our motivation for considering matrix multiplication:
we want to develop an alternative approach to solving equations of
the form Ax = b by multiplying A by another matrix B so that the product BA 
is the matrix equivalent of the number 1 (while simultaneously multiplying b by 
the same matrix B). What is the matrix equivalent of the number 1? We consider 
this question and more in the following example. 


Example 1.7.3. Consider the matrices 


5 ll 1 0 
a=| 3 =] and b=[j | 


Compute $AI_2$ and $I_2A$. What is special about the matrix $I_2$?


Solution. Using the rules for matrix multiplication, we observe that 


5 11]f1 0 5 1 
an=|_3 Z| E ee UlFA 
10|]f 5 1 5 ll 
ma=[p i] “a]=[-3 ]-4 


Thus, we see that multiplying the matrix A by $I_2$, on either side, has no effect on the matrix A.


The matrix $I_2$ in example 1.7.3 is important because it has the property
that $I_2A = A$ for any matrix A with two rows (not simply the matrix A in
example 1.7.3) and $AI_2 = A$ for any A with two columns. We can similarly show
that if $I_3$ is the matrix

\[ I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

then $I_3A = A$ for any matrix A with three rows, and $AI_3 = A$ for any matrix A
with three columns. Similar results hold for corresponding matrices $I_n$ of larger
size; each of these matrices acts like the number 1, since multiplying other
matrices by $I_n$ has no effect on the given matrix.

Matrices that leave other matrices unchanged under multiplication
are called identity matrices. More formally, the n x n identity matrix $I_n$
is the square matrix whose diagonal entries all equal 1, and whose off-diagonal
entries are all 0. (The diagonal entries in a matrix are those whose row and
column indices are the same.) Often, when the context is clear, we will write
simply I, rather than $I_n$. We also note that $I_n$ is the only matrix that is n x n and
acts as a multiplicative identity. Finally, it is evident that for any m x n matrix
A, $I_mA = AI_n = A$. In the next section, we will explore the notion of the inverse
of a matrix, and there see that identity matrices play a central role.
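The defining property of identity matrices is easy to test numerically. The Python sketch below uses our own helper names `identity` and `mat_mul` and, as an illustrative choice, the 2 x 3 matrix A from example 1.7.2; note that the identity on the left must be 2 x 2 and the one on the right 3 x 3.

```python
def identity(n):
    """Return the n x n identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def mat_mul(A, B):
    """Row-by-column product of A (m x n) and B (n x k)."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must match rows of B")
    return [
        [sum(A[i][t] * B[t][j] for t in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


A = [[1, 3, -4], [0, -7, 2]]  # a 2 x 3 matrix

print(identity(3))                   # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(mat_mul(identity(2), A) == A)  # True
print(mat_mul(A, identity(3)) == A)  # True
```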

One final algebraic operation with matrices merits formal introduction
here. Given a matrix A, its transpose, denoted $A^T$, is the matrix whose columns
are the rows of A. That is, taking the transpose of a matrix replaces its rows with
its columns, and vice versa. For example, if A is the 2 x 3 matrix

\[ A = \begin{bmatrix} 1 & 3 & -4 \\ 0 & -7 & 2 \end{bmatrix} \]

then its transpose $A^T$ is the 3 x 2 matrix

\[ A^T = \begin{bmatrix} 1 & 0 \\ 3 & -7 \\ -4 & 2 \end{bmatrix} \]


Note that this is the same notation we regularly use to express a column vector in
the form $b = [1\ 2\ 3]^T$. In the case that A is a square matrix, taking its transpose
results in swapping entries across its diagonal. For example, if

\[ A = \begin{bmatrix} 5 & -2 & 7 \\ 0 & -3 & -1 \\ -4 & 8 & -6 \end{bmatrix} \]

then

\[ A^T = \begin{bmatrix} 5 & 0 & -4 \\ -2 & -3 & 8 \\ 7 & -1 & -6 \end{bmatrix} \]


The transpose operator has several nice algebraic properties, some of which will 
be explored in the exercises. For example, for matrices for which the appropriate 
sums and products are defined, 


\[ (A + B)^T = A^T + B^T \]
and
\[ (AB)^T = B^T A^T \]


For a square matrix such as 


Ly 


it happens that $A^T = A$. Any square matrix A for which $A^T = A$ is said to
be symmetric. It turns out that symmetric matrices have several especially nice 
properties in the context of more sophisticated concepts that arise later in the 
text, and we will revisit them at that time. 
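The transpose and its product rule can be checked with a few lines of Python. In this sketch the helper names are our own; A and B are the matrices of example 1.7.2 (so that BA is defined), and S is a hypothetical symmetric matrix introduced only for illustration.

```python
def transpose(M):
    """Return the matrix whose columns are the rows of M."""
    return [list(col) for col in zip(*M)]


def mat_mul(A, B):
    """Row-by-column product of A (m x n) and B (n x k)."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must match rows of B")
    return [
        [sum(A[i][t] * B[t][j] for t in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


A = [[1, 3, -4], [0, -7, 2]]
B = [[-6, 10], [3, 2]]

print(transpose(A))  # [[1, 0], [3, -7], [-4, 2]]

# The transpose of a product reverses the order of the factors.
print(transpose(mat_mul(B, A)) == mat_mul(transpose(A), transpose(B)))  # True

S = [[2, -1], [-1, 5]]    # illustrative symmetric matrix (an assumption)
print(transpose(S) == S)  # True
```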


1.7.1 Matrix algebra using Maple 


While it is important that we first learn to add and multiply matrices by hand
to understand how these processes work, just as with row-reduction it is
reasonable to expect that we will often use available technology to perform
tedious computations like multiplying a 4 x 5 and a 5 x 7 matrix. Moreover, in
real-world applications, it is not uncommon to have to deal with matrices that
have thousands of rows and thousands of columns, or more. Here we introduce
a few Maple commands that are useful in performing some of the algebraic
manipulations we have studied in this section.

Let us consider some of the matrices defined in earlier examples: 


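\[ A = \begin{bmatrix} 1 & 3 & -4 \\ 0 & -7 & 2 \end{bmatrix}, \quad B = \begin{bmatrix} -6 & 10 \\ 3 & 2 \end{bmatrix}, \quad C = \begin{bmatrix} -6 & 10 & -1 \\ 3 & 2 & 11 \end{bmatrix} \]
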
After defining each of these three matrices with the usual commands in Maple, 
such as 


> A := <<1,0>|<3,-7>|<-4,2>>; 


we can execute the sum of A and C and the scalar multiple —3B with the 
commands 


> A + C;
> -3*B;


for which Maple will report the outputs 
\[ \begin{bmatrix} -5 & 13 & -5 \\ 3 & -5 & 13 \end{bmatrix} \qquad \begin{bmatrix} 18 & -30 \\ -9 & -6 \end{bmatrix} \]

We have previously seen that to compute a matrix-vector product, the period 
is used to indicate multiplication, as in > A.x;. The same syntax holds for 


matrix multiplication, where defined. For example, if we wish to compute the 
product BA, we enter 


> B.A; 
which yields the output 
—6 —88 44 
3 —5 —8 
If we try to have Maple compute an undefined product, such as AB through 
the command > A.B;, we get the error message 


Error, (in LinearAlgebra:-MatrixMatrixMultiply) 
first matrix column dimension (3) <> second matrix 
row dimension (2) 


In the event that we need to execute computations involving an identity 
matrix, rather than tediously enter all the 1’s and 0’s, we can use the built-in 
Maple command IdentityMatrix(n); where n is the number of rows and
columns in the matrix. For example, entering


> Id := IdentityMatrix(4);


results in the output

\[ Id = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \]

Note: Id is the name we are using to store this identity matrix. We cannot use
the letter I because I is reserved to represent $\sqrt{-1}$ in Maple.

Finally, if we desire to compute the transpose of a matrix A, such as

\[ A = \begin{bmatrix} 1 & 3 & -4 \\ 0 & -7 & 2 \end{bmatrix} \]

the relevant command is

> Transpose(A);


which generates the output

\[ A^T = \begin{bmatrix} 1 & 0 \\ 3 & -7 \\ -4 & 2 \end{bmatrix} \]


Exercises 1.7 


1. Let A, B, and C be the given matrices. In each of the following problems, 
compute (by hand) the prescribed algebraic combination of A, B, and C if 
the operation is defined. If the operation is not defined, explain why. 


—6 10 5 3 
a=[_3 - i} B=] 2 11], C=/-1 0 
3 29 a 
(a) B + C  (b) A + B  (c) -2A  (d) -3B + 4C  (e) AB
(f) BA  (g) AA  (h) A(B + C)  (i) CA  (j) C(A + B)
(k) $A^T + B$  (l) $(B + C)^T$  (m) $B^TC$  (n) $BC^T$  (o) $(AB)^T$
(p) $(BA)^T$


2. Let A, B, and C be the given matrices. In each of the following problems, 
compute (by hand) the prescribed algebraic combination of A, B, and C 
whenever the operation is defined. If the operation is not defined, explain 


why. 
5.3 2 11 1 0 
a=| 2 ak B=| 5 sh oe) | 
(a) B + C  (b) A + B  (c) -2A  (d) -3B + 4C  (e) AB
(f) BA  (g) AA  (h) A(B + C)  (i) CA  (j) C(A + B)
(k) $A^T + B$  (l) $(B + C)^T$  (m) $B^TC$  (n) $BC^T$  (o) $(AB)^T$
(p) $(BA)^T$

3. Discuss the differences between multiplying two square matrices versus
multiplying non-square matrices. That is, under what circumstances can 
two square matrices be multiplied? How does the situation change for 
non-square matrices? In addition, if the product AB is defined, is BA? 


4. Give an example of 2 x 2 matrices A and B for which $AB \neq BA$.
5. Give an example of 2 x 2 matrices A and B for which AB = BA. 


6. If A is m x n and B is n x k, and neither A nor B is square, can AB ever
equal BA? Explain. 


In exercises 7-9, let A be the given matrix. If possible, find a matrix B such that
$BA = I_2$; if B exists, determine whether BA = AB.


2 0 
anki 

24 
sel? 

1 -1 
2an[} 2] 


In exercises 10 and 11, for the given matrix A, answer each of the following 
questions: 


(a) Are the columns of A linearly independent?
(b) Do the columns of A span $\mathbb{R}^2$?
(c) How many pivot positions does A have?

(d) Solve the equation Ax = 0 by row reducing by hand. Is A row equivalent
to an important matrix?

(e) If possible, determine a 2 x 2 matrix B such that $BA = I_2$.
-—l 2 
wae(t 3 


—1 2 
anf 2 


12. Decide whether each of the following sentences is true or false. In every 
case, write one sentence to support your answer. 


(a) If A and B are matrices of the same size, then the products AB and BA 
are always defined. 

(b) If A and B are matrices such that the products AB and BA are both 
defined, then AB = BA. 

(c) If A and B are matrices such that AB is defined, then $(AB)^T = A^TB^T$.

(d) If A and B are matrices such that A + B is defined, then
$(A + B)^T = A^T + B^T$.


13. Compute the prescribed algebraic computations in exercise 1 using a 
computer algebra system. 


14. Compute the prescribed algebraic computations in exercise 2 using a 
computer algebra system. 


1.8 The inverse of a matrix 


We have observed repeatedly that linear algebra is a subject centered on one 
idea—systems of linear equations—viewed from several different perspectives. 
Continuing with this theme, we have recently considered an alternative method 
for solving the equation Ax = b by attempting to find a matrix B such that 
BA =I, where I is the appropriate identity matrix. If we can in fact find such a 
matrix B, it follows that 


B(Ax) = Bb (1.8.1) 


By the associativity of matrix multiplication and the defining property of B, it 
follows that 


B(Ax) = (BA)x = Ix =x (1.8.2) 


Equations (1.8.1) and (1.8.2) together imply that x = Bb. Thus, the existence of 
such a matrix B shows us how we can solve Ax = b by multiplication. It turns out 
that from a computational point of view, row-reduction is a superior approach 
to solving Ax = b; nonetheless, the perspective that it may be possible to solve 
the equation through the use of a multiplicative inverse has many important 
theoretical applications. In addition, similar ideas will be encountered in our 
study of differential equations. 

Our work in section 1.7 showed that if A and B are not square matrices, 
it is never the case that AB and BA are equal. Thus it is only possible to find a 
matrix B such that AB = BA = I if A is square (though even then it is not always
the case that such a matrix B exists). Moreover, as we know from theorem 1.6.3, 
some square matrices have the important property that the equation Ax = b has 
a unique solution for every possible choice of b. 

For the next few sections, we therefore focus our attention almost exclusively 
on square matrices. Here, our emphasis is on the questions “when does a matrix 
B exist such that AB = BA = I?” and “when such a matrix B exists, how can we 
find it?” The next definition formalizes the notion of the inverse of a matrix. 


Definition 1.8.1 If A is an n x n matrix, we say that A is invertible if and only
if there exists an n x n matrix B such that

\[ AB = BA = I_n \tag{1.8.3} \]

When A is invertible, we call B the inverse of A and write $B = A^{-1}$ (read “B is
A-inverse”). If A is not invertible, A is often called a singular matrix, and thus
saying “A is invertible” is equivalent to saying “A is nonsingular.”

It can be shown (see exercise 19) that if A is an invertible n x n matrix,
then its inverse is unique (i.e., a given matrix cannot have two distinct inverses).
In addition, we note from our discussion above in (1.8.1) and (1.8.2) that if
A is invertible, then the equation Ax = b has a solution for every $b \in \mathbb{R}^n$. In
particular, that solution is $x = A^{-1}b$. Moreover, since Ax = b has a solution
for every $b \in \mathbb{R}^n$, we know from theorem 1.6.1 that A has a pivot position in
every row. From this, the fact that A is square, and theorem 1.6.3, it follows that
Ax = b has a unique solution for every $b \in \mathbb{R}^n$. We state this result formally in
the following theorem.


Theorem 1.8.1 If A is an n x n invertible matrix, then the equation Ax = b
has a unique solution for every $b \in \mathbb{R}^n$.


Before beginning to explore how to find the inverse of a matrix, as well as when 
the inverse even exists, we consider an example to see how we may check if two 
matrices are inverses and how to apply an inverse to solve a related equation. 


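Example 1.8.1 Let A and B be the matrices

\[ A = \begin{bmatrix} 4 & 5 \\ 1 & 2 \end{bmatrix}, \qquad B = \begin{bmatrix} 2/3 & -5/3 \\ -1/3 & 4/3 \end{bmatrix} \]

Show that A and B are inverses, and then use this fact to solve Ax = b, where
$b = [-7\ 3]^T$, without using row reduction.


Solution. The reader should verify that the following matrix products indeed
hold:

\[ AB = \begin{bmatrix} 4 & 5 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} 2/3 & -5/3 \\ -1/3 & 4/3 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]

\[ BA = \begin{bmatrix} 2/3 & -5/3 \\ -1/3 & 4/3 \end{bmatrix} \begin{bmatrix} 4 & 5 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]

This shows that indeed $B = A^{-1}$. Note, equivalently, that $A = B^{-1}$. Now, we can
easily solve the equation Ax = b where b is the given vector:

\[ x = A^{-1}b = \begin{bmatrix} 2/3 & -5/3 \\ -1/3 & 4/3 \end{bmatrix} \begin{bmatrix} -7 \\ 3 \end{bmatrix} = \begin{bmatrix} -29/3 \\ 19/3 \end{bmatrix} \]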


Of course, what is not clear in example 1.8.1 is how, given the matrix A, one
might determine the entries in the inverse matrix $B = A^{-1}$. We now explore this
in the 3 x 3 case for a general matrix A, and along the way learn conditions that
guarantee that $A^{-1}$ exists.

Given a 3 x 3 matrix A, we seek a matrix B such that $AB = I_3$. Let the
columns of B be $b_1$, $b_2$, and $b_3$, and the columns of $I_3$ be $e_1$, $e_2$, and $e_3$. The
column-wise definition of matrix multiplication then tells us that the following
three vector equations must hold:

\[ Ab_1 = e_1, \quad Ab_2 = e_2, \quad \text{and} \quad Ab_3 = e_3 \tag{1.8.4} \]

For the unique inverse matrix B to exist, it follows that each of these equations
must have a unique solution. Clearly if A has a pivot position in every row (or,
equivalently, the columns of A span $\mathbb{R}^3$), then by theorem 1.6.3 it follows that
we can find unique vectors $b_1$, $b_2$, and $b_3$ that make these three equations hold.
Thus, any one of the conditions in theorem 1.6.3 will guarantee that $B = A^{-1}$
exists. Moreover, if $A^{-1}$ exists, we know from theorem 1.8.1 that every condition
in theorem 1.6.3 also holds.

Momentarily, let us assume that A is indeed invertible. If we proceed to find
the matrix B by solving the three equations in (1.8.4), we see that row-reduction
provides an approach for producing all three vectors at once. To find these
vectors one at a time, it would be necessary to row-reduce each of the three
augmented matrices

\[ [A\ e_1], \quad [A\ e_2], \quad \text{and} \quad [A\ e_3] \tag{1.8.5} \]

In each case, the exact same elementary row operations will be applied to A and
thus be applied, respectively, to the vectors $e_1$, $e_2$, and $e_3$. As such, we may do
all of them at once by considering the augmented matrix

\[ [A\ e_1\ e_2\ e_3] \tag{1.8.6} \]

Note particularly that the form of the augmented matrix in (1.8.6) is $[A\ I_3]$. If
we now row-reduce this matrix, and A has a pivot in every row, it follows that
we will be able to read the coefficients of $A^{-1}$ from the result. This process is best
illuminated by an example, so we now explore how these computations lead us
to $A^{-1}$ in a concrete situation.


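Example 1.8.2 Find the inverse of the matrix

\[ A = \begin{bmatrix} 2 & 1 & -2 \\ 1 & 1 & -1 \\ -2 & -1 & 3 \end{bmatrix} \]

Solution. Following the discussion above, we augment A with the 3 x 3
identity matrix and row-reduce. It follows that

\[ \left[\begin{array}{ccc|ccc} 2 & 1 & -2 & 1 & 0 & 0 \\ 1 & 1 & -1 & 0 & 1 & 0 \\ -2 & -1 & 3 & 0 & 0 & 1 \end{array}\right] \to \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 2 & -1 & 1 \\ 0 & 1 & 0 & -1 & 2 & 0 \\ 0 & 0 & 1 & 1 & 0 & 1 \end{array}\right] \]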


These computations demonstrate two important things. The first is that the
row reduction of A in the first three columns of the augmented matrix shows
that A has a pivot position in every row, and therefore A is invertible. Moreover,
the row-reduced form of $[A\ I_3]$ tells us that $A^{-1}$ is the matrix

\[ A^{-1} = \begin{bmatrix} 2 & -1 & 1 \\ -1 & 2 & 0 \\ 1 & 0 & 1 \end{bmatrix} \]

Again, we observe from our preceding discussion and example 1.8.2 that we have
found an algorithm for finding the inverse of a square matrix A. We augment A
with the corresponding identity matrix and row-reduce. Provided that A has a
pivot in every row, we find by row-reducing that

\[ [A\ I] \to [I\ A^{-1}] \]

That is, row-reduction of an invertible matrix A augmented with the identity
matrix leads us directly to the inverse, $A^{-1}$.

Next, we examine what happens in the event that a square matrix is not 
invertible. 


Example 1.8.3 Find the inverse of the matrix

\[ A = \begin{bmatrix} 2 & 1 \\ -6 & -3 \end{bmatrix} \]

provided the inverse exists. If the inverse does not exist, explain why.


Solution. We augment A with the 2 x 2 identity matrix and row-reduce,
finding that

\[ \left[\begin{array}{cc|cc} 2 & 1 & 1 & 0 \\ -6 & -3 & 0 & 1 \end{array}\right] \to \left[\begin{array}{cc|cc} 1 & 1/2 & 0 & -1/6 \\ 0 & 0 & 1 & 1/3 \end{array}\right] \]

Again, we see at least two key facts from these computations: A does not have
a pivot position in every row, and thus A is not invertible. In particular, recall
that we are solving two vector equations simultaneously in these computations:
$Ab_1 = e_1$ and $Ab_2 = e_2$. If we consider the first of these and observe the
row-reduction

\[ \left[\begin{array}{cc|c} 2 & 1 & 1 \\ -6 & -3 & 0 \end{array}\right] \to \left[\begin{array}{cc|c} 1 & 1/2 & 0 \\ 0 & 0 & 1 \end{array}\right] \]

we see that this system of equations is inconsistent: the last row of the
augmented matrix is equivalent to the equation $0b_{11} + 0b_{12} = 1$, where
$b_1 = [b_{11}\ b_{12}]^T$. This is yet another way of saying that A does not have an inverse.


The above two examples together show us, in general, how we answer two
questions at once: does the square matrix A have an inverse? And if so, what is
$A^{-1}$? In a computational sense, we can simply row-reduce A augmented with
the appropriate identity matrix and then observe if A has a pivot position in
every row. If A is row equivalent to the appropriately sized identity matrix, then
A is invertible and $A^{-1}$ will be revealed through the row-reduction.
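The procedure just described can be sketched in a few lines of Python. This is only one possible rendering, under our own assumptions: the function name `inverse` is ours, matrices are lists of rows, and exact fractions from the standard library are used so that no rounding occurs. The function row-reduces [A I] and returns None when some column of A has no pivot, so that A is not invertible.

```python
from fractions import Fraction


def inverse(A):
    """Row-reduce [A | I]; return A^(-1), or None if A is not invertible."""
    n = len(A)
    # Augment A with the n x n identity matrix, using exact arithmetic.
    M = [
        [Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(A)
    ]
    for col in range(n):
        # Look for a nonzero entry in this column, at or below the diagonal.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None  # no pivot here, so A is not row equivalent to I
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so that the pivot entry equals 1.
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        # Clear every other entry in the pivot column.
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    # The right-hand block now holds the inverse.
    return [row[n:] for row in M]


A = [[2, 1, -2], [1, 1, -1], [-2, -1, 3]]   # example 1.8.2
print([[int(x) for x in row] for row in inverse(A)])
# [[2, -1, 1], [-1, 2, 0], [1, 0, 1]]

S = [[2, 1], [-6, -3]]                      # example 1.8.3
print(inverse(S))                           # None
```

Exact fractions are a design choice for this sketch; with floating-point numbers, the test for a zero pivot would need a tolerance.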
We close this section with a formal statement of a theorem that summarizes
our discussion. Note particularly how this result extends theorem 1.6.3 and 
demonstrates the theme of linear algebra: one idea from several perspectives. 
We will refer to this result as The Invertible Matrix Theorem. 


Theorem 1.8.2 (The Invertible Matrix Theorem) Let A be an n x n matrix.
The following statements are equivalent:

a. A is invertible.
b. The columns of A are linearly independent.
c. The columns of A span $\mathbb{R}^n$.
d. A has a pivot position in every column.
e. A has a pivot position in every row.
f. A is row equivalent to $I_n$.
g. For each $b \in \mathbb{R}^n$, the equation Ax = b has a unique solution.


In addition to being of great theoretical significance, inverse matrices 
find many key applications. We investigate one such use in the following 
subsection. 


1.8.1 Computer graphics 


Linear algebra is the engine that drives computer animations. While animated 
movies originally were constructed by artists hand-drawing thousands of similar 
sketches that were photographed and played in sequence, today such films 
are created entirely with computers. Once a figure has been constructed, 
moving the image around the screen is essentially an exercise in matrix 
multiplication. 

Every pixel in an image on a computer screen can be represented through 
coordinates. For an elementary example, consider an animated figure which, at 
a given point in time, has its hand located at the point (3, 4). To see how a basic 
animation can be built, assume further that the figure’s elbow is at the origin 
(0, 0), and that an animator wishes to make the hand wave back and forth. This 
enables us to represent the forearm of the figure with the vector $v = [3\ 4]^T$.

If we now consider the matrix

\[ R = \begin{bmatrix} \sqrt{3}/2 & -1/2 \\ 1/2 & \sqrt{3}/2 \end{bmatrix} \]

and apply the matrix R to the vector v, we see that the product is

\[ Rv = \begin{bmatrix} \sqrt{3}/2 & -1/2 \\ 1/2 & \sqrt{3}/2 \end{bmatrix} \begin{bmatrix} 3 \\ 4 \end{bmatrix} = \begin{bmatrix} (3\sqrt{3} - 4)/2 \\ (3 + 4\sqrt{3})/2 \end{bmatrix} \approx \begin{bmatrix} 0.598 \\ 4.964 \end{bmatrix} \]

Thus, the figure’s hand is now located at the point (0.598, 4.964). In fact,
the hand has been rotated 30° counterclockwise about the origin, as shown
in figure 1.13.

Figure 1.13 The vectors $v = [3\ 4]^T$ and $Rv = [0.598\ 4.964]^T$.

The matrix R is known as a rotation matrix; its impact on any vector is to
rotate the vector 30° counterclockwise about the origin. One way to see why this
is so is to compute the vectors $Re_1$ and $Re_2$, where $e_1$ and $e_2$ are the columns
of the 2 x 2 identity matrix. Since each of those two vectors is rotated 30° when
multiplied by R, the same thing happens to any vector in $\mathbb{R}^2$, because any such
vector may be written as a linear combination of $e_1$ and $e_2$.
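The rotation of the forearm vector can be reproduced numerically. The Python sketch below uses only the standard math module; the helper names `rotation` and `mat_vec` are our own, and the entries of R are computed from cos 30° and sin 30°, which agree with the exact values used above.

```python
import math


def rotation(degrees):
    """Matrix that rotates vectors counterclockwise by the given angle."""
    t = math.radians(degrees)
    return [[math.cos(t), -math.sin(t)],
            [math.sin(t), math.cos(t)]]


def mat_vec(A, x):
    """Return the matrix-vector product Ax."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


R = rotation(30)
v = [3, 4]
Rv = mat_vec(R, v)

print([round(c, 3) for c in Rv])  # [0.598, 4.964]

# A rotation does not change the length of a vector (here, length 5).
print(math.isclose(math.hypot(*Rv), math.hypot(*v)))  # True
```

Because the entries are floating-point numbers, results are compared after rounding or with a tolerance rather than with exact equality.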

Not only do computer animations show one application of matrix—vector 
multiplication, but they also demonstrate the need for inverse matrices. For 
instance, suppose we knew that the matrix R had been applied to some unknown 
vector v and that the result was

\[ Rv = \begin{bmatrix} 2 \\ 5 \end{bmatrix} \]

That is, a hand located at some unknown point v was waved and had been moved
to the new point (2, 5). An animator might want to wave the hand back so that
it ended up at its original location, which is again represented by the vector v.
To do so, he must answer the question “for which vector v is $Rv = [2\ 5]^T$?”

We now know that one way to solve for v is to use the inverse of R. The
matrix R is clearly invertible because its columns are linearly independent; we
can compute $R^{-1}$ in the standard way to find that

\[ R^{-1} = \begin{bmatrix} \sqrt{3}/2 & 1/2 \\ -1/2 & \sqrt{3}/2 \end{bmatrix} \]

We can solve for v by computing

\[ v = R^{-1}(Rv) = R^{-1} \begin{bmatrix} 2 \\ 5 \end{bmatrix} \]

so that

\[ v = \begin{bmatrix} \sqrt{3}/2 & 1/2 \\ -1/2 & \sqrt{3}/2 \end{bmatrix} \begin{bmatrix} 2 \\ 5 \end{bmatrix} = \begin{bmatrix} \sqrt{3} + 5/2 \\ -1 + 5\sqrt{3}/2 \end{bmatrix} \approx \begin{bmatrix} 4.232 \\ 3.330 \end{bmatrix} \]

Of course, in actual animations, we would not wave the hand by a single 30°
rotation, but rather through a sequence of consecutive small rotations, for
instance, 1-degree rotations. Again, computers enable us to do thousands of
such computations almost instantly and make amazing animations possible.

We consider an additional example to see how matrices can be used to store data
and how matrices and their inverses can be used to transform the data.


Example 1.8.4 Consider the matrix

\[ B = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \]

Let $v_1 = [2\ 1]^T$, $v_2 = [3\ 3]^T$, and $v_3 = [4\ 0]^T$ be the vertices of a triangle in
the plane. Compute $Bv_1$, $Bv_2$, and $Bv_3$. Sketch a picture of the new triangle that
has resulted from applying the matrix B to the vertices (2, 1), (3, 3), and (4, 0).
What is the impact of the matrix B on each point? Finally, determine the inverse
of B. What do you observe?


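Solution. We observe first that

\[ Bv_1 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \qquad Bv_2 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 3 \\ 3 \end{bmatrix} = \begin{bmatrix} 3 \\ 3 \end{bmatrix}, \]
\[ Bv_3 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 4 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 4 \end{bmatrix} \]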


From these calculations, we see that multiplying by B moves a given point to a 
new point that corresponds to the one found by switching the coordinates of 
the given point. Geometrically, the matrix B accomplishes a reflection across 
the line y = x in the plane, as we can see in figure 1.14. 


Moreover, if we think about how we might undo reflection across the line y = x,
it is clear that to restore a point to its original location, we need to reflect the
point back across the line. Said differently, the inverse of the matrix B must be
the matrix itself. We can confirm that $B^{-1} = B$ by computing the product

\[ BB = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I_2 \]


Figure 1.14 The triangle with vertices $v_1 = [2\ 1]^T$, $v_2 = [3\ 3]^T$,
and $v_3 = [4\ 0]^T$ and its image under multiplication by the matrix B.

It is noteworthy that the calculations of $Bv_1$, $Bv_2$, and $Bv_3$ can be simplified into
a single matrix product if we let $T = [v_1\ v_2\ v_3]$. That is, the matrix T holds the
coordinates of the three points in the given triangle; the product BT is then the
image of the triangle under multiplication by the matrix B. A more complicated
polygonal figure than a triangle would be stored in a matrix with additional
columns.
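Storing the triangle as the columns of a single matrix T makes the whole transformation one product. The short Python sketch below (with our own helper name `mat_mul` and matrices stored as lists of rows) computes BT for the reflection matrix B of example 1.8.4 and confirms that applying B a second time restores the original triangle.

```python
def mat_mul(A, B):
    """Row-by-column product of A (m x n) and B (n x k)."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must match rows of B")
    return [
        [sum(A[i][t] * B[t][j] for t in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


B = [[0, 1], [1, 0]]        # reflection across the line y = x
T = [[2, 3, 4],             # x-coordinates of the vertices
     [1, 3, 0]]             # y-coordinates of the vertices

image = mat_mul(B, T)
print(image)                     # [[1, 3, 0], [2, 3, 4]]
print(mat_mul(B, image) == T)    # True: B undoes its own reflection
```

Each column of the output is the image of the corresponding vertex, so the reflected triangle has vertices (1, 2), (3, 3), and (0, 4).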

Of course, the actual work of computer animations is much more complicated
than what we have presented here. Nonetheless, matrix multiplication
is the platform on which the entire enterprise of animated films is built. In 
addition to achieving rotations and reflections, matrices can be used to dilate 
(or magnify) images, to shear images, and even to translate them (provided that 
we are clever about the coordinate system we use to represent points). Finally, 
matrices are even essential to the storage of images, as each column of a matrix 
can be viewed as a data point in an image. More about the application of matrices 
and their inverses to computer graphics can be learned in one of the projects 
found at the end of this chapter. In addition, a deeper discussion of the notion 
of linear transformations (of which reflection and rotation matrices are a part) 
can be found in appendix D. 


1.8.2 Matrix inverses using Maple 


Certainly we can use Maple’s row-reduction commands to find inverses of 
matrices. However, an even simpler command exists that enables us to avoid 
having to enter the corresponding identity matrix. Let us consider the two 
matrices from examples 1.8.2 and 1.8.3. Let 


\[ A = \begin{bmatrix} 2 & 1 & -2 \\ 1 & 1 & -1 \\ -2 & -1 & 3 \end{bmatrix} \]

If we enter the command


> MatrixInverse(A);

we see the resulting output, which is indeed $A^{-1}$:

\[ \begin{bmatrix} 2 & -1 & 1 \\ -1 & 2 & 0 \\ 1 & 0 & 1 \end{bmatrix} \]

For the matrix

\[ A = \begin{bmatrix} 2 & 1 \\ -6 & -3 \end{bmatrix} \]

executing the command > MatrixInverse(A); produces the output


Error, (in LinearAlgebra:-LA_Main:-MatrixInverse) 
Singular matrix 


which is Maple’s way of saying “A is not invertible.” 


Exercises 1.8 In exercises 1-5, find the inverse of each matrix (doing the 
computations by hand), or show that the inverse does not exist. 


4. $\begin{bmatrix} 1 & 2 & -1 \\ 0 & 1 & 3 \\ 0 & 0 & 2 \end{bmatrix}$

5. $\begin{bmatrix} 1 & -2 & -1 \\ -1 & 1 & 0 \\ 1 & 3 & 4 \end{bmatrix}$


1 3 —3 2 11 : 
6. Let =| 4] andbs =| 3 b=|_3].m=[ 1 | ind a-t and 


use it to solve the equations Ax = b;, Ax = bz, and Ax = b3. In addition, 
show how you can use row reduction to solve all three of these equations 
simultaneously. 


1 -—3 10 —1/2 2 
7.teta=|_ g|andbi =| _ 39 f be=| (} b= |} some the 


equations Ax = b;, Ax = bz, and Ax = b3. What do you observe about the 
matrix A? 

1 -—2 
1 2 
8. Let A= | and b= | Without doing any computations, explain
why b may be written as a linear combination of the columns of A.
Then execute computations to find the explicit weights by which b is a
linear combination of the columns of A.

9. Let E be the elementary matrix given by
$E = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}$. Note that E is
obtained by interchanging rows 2 and 3 of the 3 x 3 identity matrix.
Choose a 3 x 3 matrix A, and compute EA. What is the effect on A of
multiplication by E?

10. Without doing any row-reduction, determine $E^{-1}$ where E is the matrix
defined in exercise 9. (Hint: $E^{-1}EI = I$. Think about the impact that E has
on I, and then what $E^{-1}$ must accomplish.)

11. Let E be the elementary matrix given by
$E = \begin{bmatrix} 1 & 0 & 0 \\ 0 & c & 0 \\ 0 & 0 & 1 \end{bmatrix}$. Note that E is
obtained by scaling the second row of the 3 x 3 identity matrix by the
constant c. Choose a 3 x 3 matrix A, and compute EA. What is the effect
on A of multiplication by E?

12. Without doing any row reduction, determine $E^{-1}$ where E is the matrix
defined in exercise 11. What do you observe?

13. Let E be the elementary matrix given by
$E = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ a & 0 & 1 \end{bmatrix}$. Note that E is
obtained by applying the row operation of taking a times row 1 of the 3 x 3
identity matrix and adding it to row 3 to form a new row 3. Choose a 3 x 3
matrix A, and compute EA. What is the effect on A of multiplication by E?

14. Without doing any row reduction, determine $E^{-1}$ where E is the matrix
defined in exercise 13. (Hint: $E^{-1}EI = I$. Think about the impact that E
has on I, and then what $E^{-1}$ must accomplish.)

1//2 1/2 = 
Let A= : te A~*. Wh t th 
e I; IZ 1/2 Compute at do you observe about the 


relationship between A and A~!? 

16. Let $\theta$ be any real number and
$A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$. Compute $A^T$ and
$A^TA$. What do you observe about the relationship between A and $A^T$?

17. Let A and B be invertible n x n matrices with inverses $A^{-1}$ and $B^{-1}$,
respectively. Show that AB is also an invertible matrix by finding $(AB)^{-1}$
in terms of $A^{-1}$ and $B^{-1}$.

18. Let A be an invertible matrix. Explain why $A^{-1}$ is also invertible, and find
$(A^{-1})^{-1}$.

19. Show that if A is an invertible n x n matrix, then its inverse is unique.
(Hint: suppose that both B and C are inverses of A. What can you say
about AB and AC?)
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20. 


2 


pany 


22. 


23; 


24, 


25. 


26. 


27. 
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For real numbers a and b, the Zero Product Property states that “if 
a-b=0, then a= 0 or b= 0.” Said differently, if a 40 and b 4 0, then 
a-b#0. Let 0 be the 2 x 2 zero matrix (i.e., all entries are zero). Does the 
Zero Product Property hold for matrices? That is, can you find two 
nonzero matrices A and B such that AB = 0? Can you find such matrices 
where none of the entries in A or B are zero? If so, what kind of matrices 
are A and B? 


. Does there exist a 2 x 2 matrix A, none of whose entries are zero, such that 


A? =0? 

Does there exist a 2 x 2 matrix A other than the identity matrix such that 
A? =I? What is special about such a matrix? 

23. Let D be a diagonal matrix, P an invertible matrix, and $A = PDP^{-1}$. Using
the expression $PDP^{-1}$ for A, compute and simplify the matrix $A^2 = A \cdot A$.
Do likewise for $A^3 = A \cdot A \cdot A$. What will be the simplified form of $A^n$ in
terms of P, D, and $P^{-1}$?


24. Let A be the matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$. Find conditions on a, b, c, and d that
guarantee that Ax = 0 has infinitely many solutions. What must therefore
be true about a, b, c, and d in order for A to be invertible?

25. Let

$$A = \begin{bmatrix} 1/2 & \sqrt{3}/2 \\ -\sqrt{3}/2 & 1/2 \end{bmatrix}$$

and $v_1$, $v_2$, $v_3$ be the vectors that emanate from
the origin to the vertices of the triangle given by (2, 1), (3, 3), and (4, 0).
Compute the new triangle that results from applying the matrix A to the
given vertices, and sketch a picture of the original triangle and the
resulting image. What is the effect of multiplying by A?


26. Suppose that A in exercise 25 was applied to a different set of three
unknown vectors $x_1$, $x_2$, and $x_3$. The resulting output from these
products is

$$Ax_1 = \begin{bmatrix} -4 \\ 2 \end{bmatrix}, \quad Ax_2 = \begin{bmatrix} 0 \\ 3 \end{bmatrix}, \quad \text{and} \quad Ax_3 = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$$

In other words, the new image after multiplying by A is the triangle whose
vertices are (-4, 2), (0, 3), and (2, 1).

Determine the exact vectors $x_1$, $x_2$, and $x_3$ and sketch the original triangle
that was mapped to the triangle with vertices (-4, 2), (0, 3), and (2, 1).


27. Consider the matrix

     0 -1
B =  ' i

Let $v_1 = [2 \ 1]^T$, $v_2 = [3 \ 3]^T$, and $v_3 = [4 \ 0]^T$. Compute $Bv_1$, $Bv_2$, and
$Bv_3$. Sketch a picture of the new triangle that has resulted from applying
the matrix B to the vertices (1, 1), (2, 3), and (4, 0). What is the geometric
effect of the matrix B on each point?


28. Determine the inverse of B in exercise 27. What do you observe?

29. An unknown 2 x 2 matrix C is applied to the two vectors $v_1 = [1 \ 1]^T$ and
$v_2 = [2 \ 3]^T$, and the results are $Cv_1 = [0.1 \ 0.7]^T$ and $Cv_2 = [-0.1 \ 1.8]^T$.
Determine the entries in the matrix C.


30. Suppose that a computer graphics programmer decides to use the matrix

$$A = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix}$$

Why is the programmer’s choice a bad one? What will be the result of
applying this matrix to any collection of points?


31. Suppose that for a large population that stays relatively constant, people
are classified as living in urban, suburban, or rural settings. Moreover,
assume that the probabilities of the various possible transitions are given
by the following table:

Future location (↓) / current location (→) | U(%) | S(%) | R(%)
Urban                                       |  92  |   3  |   2
Suburban                                    |   7  |  96  |  10
Rural                                       |   3  |   1  |  88

Given that the population of 250 million in a certain year is distributed
among 100 million urban, 100 million suburban, and 50 million rural,
determine the population distribution in each of the preceding two years.

32. Car-owners can be grouped into classes based on the vehicles they own.
A study of owners of sedans, minivans, and sport-utility vehicles shows
that the likelihood that an owner of one of these automobiles will replace
it with another of the same or different type is given by the table

Future vehicle (↓) / current vehicle (→) | Sedan(%) | Minivan(%) | SUV(%)
Sedan                                     |    91    |     3      |    2
Minivan                                   |     7    |    95      |    8
SUV                                       |     2    |     2      |   90

If there are currently 100 000 sedans, 60 000 minivans, and 80 000 SUVs
among the owners being studied, determine the distribution of vehicles
among the population before each current owner replaced his or her
previous vehicle.
33. Decide whether each of the following sentences is true or false. In every
case, write one sentence to support your answer. 


(a) If A is a matrix with a pivot in every row, then A is invertible. 

(b) If A is an invertible matrix, then its columns are linearly independent. 

(c) If Ax = b has a unique solution, then A is an invertible matrix. 

(d) If A and B are invertible matrices, then $(AB)^{-1}$ exists and
$(AB)^{-1} = A^{-1}B^{-1}$.

(e) If A is a square matrix row equivalent to the identity matrix, then A is 
invertible. 

(f) If A is a square matrix and Ax = b has a solution for a given vector b, 
then Ax = c has a solution for every choice of c. 

(g) If R is a matrix that reflects points across a line through the origin,
then $R^{-1} = R$.

(h) If A and B are 2 x 2 matrices with all nonzero entries, then AB cannot 
equal the 2 x 2 zero matrix. 


1.9 The determinant of a matrix 


The Invertible Matrix Theorem (theorem 1.8.2) tells us that there are several 
different ways to determine whether or not a matrix is invertible, and hence 
whether or not an n x n system of linear equations has a unique solution. There
is at least one more useful way to characterize invertibility, and that is through 
the concept of a determinant. As seen in exercise 24 of section 1.8, it may be 
shown through row-reduction that the general 2 x 2 matrix 


$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

is invertible if and only if $ad - bc \neq 0$. We call the quantity $(ad - bc)$ the
determinant of the matrix A, and write⁴ $\det(A) = ad - bc$. Note that this
expression provides a condition on the entries of matrix A that determines
whether or not A is invertible.

⁴ Some authors use the notation |A| instead of det(A).

We can explore similar ideas for larger matrices. For example, if we take an
arbitrary 3 x 3 matrix

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$

and row-reduce in order to explore conditions under which the matrix has a
pivot position in every row, it turns out to be necessary that the quantity

$$D = a_{11}a_{22}a_{33} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31}$$

is nonzero. Grouping and factoring, we see that D may be rewritten in the form

$$D = a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{12}(a_{21}a_{33} - a_{23}a_{31}) + a_{13}(a_{21}a_{32} - a_{22}a_{31}) \tag{1.9.1}$$


We again call this quantity D the determinant of the matrix A. In (1.9.1) we 
see evidence of the fact that determinants of larger matrices can be defined 
recursively in terms of smaller matrices found within the original matrix A. For 
example, letting

$$A_{11} = \begin{bmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{bmatrix}$$

it follows that $\det(A_{11}) = a_{22}a_{33} - a_{23}a_{32}$, which is the expression multiplied by
$a_{11}$ in (1.9.1). More generally, if we let $A_{ij}$ be the submatrix defined by deleting
row i and column j of the original matrix A, then we see from (1.9.1) that

$$D = a_{11}\det(A_{11}) - a_{12}\det(A_{12}) + a_{13}\det(A_{13})$$


The formal definition of the determinant of an n x n matrix is given through
a similar recursive process.


Definition 1.9.1 The determinant of an n x n matrix A with entries $a_{ij}$ is
defined to be the number given by

$$\det(A) = a_{11}\det(A_{11}) - a_{12}\det(A_{12}) + \cdots + (-1)^{n+1}a_{1n}\det(A_{1n}) \tag{1.9.2}$$

where $A_{ij}$ is the matrix found by deleting row i and column j of A.
We next consider an example to see some concrete computations. 


Example 1.9.1 Compute the determinant of the matrix 


$$A = \begin{bmatrix} 2 & -1 & 1 \\ 1 & 1 & 2 \\ -3 & 0 & -3 \end{bmatrix}$$


In addition, determine if A is invertible. 


Solution. By definition, 


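$$\begin{aligned}
\det\begin{bmatrix} 2 & -1 & 1 \\ 1 & 1 & 2 \\ -3 & 0 & -3 \end{bmatrix}
&= 2\det\begin{bmatrix} 1 & 2 \\ 0 & -3 \end{bmatrix} - (-1)\det\begin{bmatrix} 1 & 2 \\ -3 & -3 \end{bmatrix} + 1\det\begin{bmatrix} 1 & 1 \\ -3 & 0 \end{bmatrix} \\
&= 2(-3 - 0) + (-3 - (-6)) + 1(0 - (-3)) \\
&= -6 + 3 + 3 \\
&= 0
\end{aligned}$$

Next, to determine whether or not A is invertible, we row-reduce A to see if A
has a pivot position in every row. Doing so, we find that

$$\begin{bmatrix} 2 & -1 & 1 \\ 1 & 1 & 2 \\ -3 & 0 & -3 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}$$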


Thus, we see that A does not have a pivot in every row, and therefore A is not 
invertible. 
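The recursive definition translates almost directly into code. The following is a minimal sketch in Python (rather than the Maple used elsewhere in this text); the helper names `minor` and `det` are illustrative choices, not taken from the text. It expands along the first row as in definition 1.9.1 and is applied to the matrix of example 1.9.1.

```python
def minor(matrix, row, col):
    """Return the submatrix obtained by deleting the given row and column."""
    return [r[:col] + r[col + 1:] for i, r in enumerate(matrix) if i != row]


def det(matrix):
    """Determinant by cofactor expansion along the first row."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    total = 0
    for j in range(n):
        sign = (-1) ** j  # signs alternate +, -, +, ... across the first row
        total += sign * matrix[0][j] * det(minor(matrix, 0, j))
    return total


A = [[2, -1, 1],
     [1, 1, 2],
     [-3, 0, -3]]
print(det(A))  # prints 0, matching the hand computation
```

Because each call spawns n smaller determinants, the work grows roughly like n!, so a sketch of this kind is only practical for small matrices.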


Of course, we should note that the primary motivation for the concept 
of the determinant comes from the question, “is A invertible?” Indeed, one 
reason the 3 x 3 matrix in the above example is not invertible is precisely 
because its determinant is zero. Later in this section, we will formally establish 
the connection between the value of the determinant and the invertibility of a 
general n x n matrix. 

It is clear at this point that determinants of most n x n matrices with 
n> 3 require a substantial number of computations. Certain matrices, however, 
have particularly simple determinants to calculate, as the following example 
demonstrates. 


Example 1.9.2. Compute the determinant of the matrix 


$$A = \begin{bmatrix} 2 & -2 & 7 \\ 0 & -5 & 3 \\ 0 & 0 & 4 \end{bmatrix}$$


In addition, determine if A is invertible. 


Solution. Again using the definition, we see that 


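$$\begin{aligned}
\det(A) &= 2\det\begin{bmatrix} -5 & 3 \\ 0 & 4 \end{bmatrix} - (-2)\det\begin{bmatrix} 0 & 3 \\ 0 & 4 \end{bmatrix} + 7\det\begin{bmatrix} 0 & -5 \\ 0 & 0 \end{bmatrix} \\
&= 2(-5 \cdot 4 - 3 \cdot 0) + 2(0 - 0) + 7(0 - 0) \\
&= 2(-5)(4) = -40
\end{aligned}$$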
Note particularly that the determinant of A is the product of its diagonal entries. 


Moreover, A clearly has a pivot position in every row, and so by this fact 
(or equivalently by the nonzero determinant of A) we see that A is invertible. 


In general, the determinant of any triangular matrix (one where all entries 
either below or above the diagonal are zero) is simply the product of its diagonal 
entries. There are other interesting properties that the determinant has, several 
of which are explored in the next example for the 2 x 2 case. 


Example 1.9.3 Let

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

be an arbitrary 2 x 2 matrix. Explore the effect of elementary row operations on
the determinant of A.


Solution. First, let us consider a row swap, calling $A_1$ the matrix

$$A_1 = \begin{bmatrix} c & d \\ a & b \end{bmatrix}$$

We observe immediately that $\det(A) = ad - bc$ and $\det(A_1) = cb - ad = -\det(A)$.

We next consider scaling; let $A_2$ be the matrix whose first row is [ka kb], a
scaled version of row 1 in A. We see that $\det(A_2) = kad - kbc = k(ad - bc) = k \cdot \det(A)$.

Finally, replacing, say, row 2 of A by the sum of k times row 1 with itself,
we arrive at the matrix

$$A_3 = \begin{bmatrix} a & b \\ c + ka & d + kb \end{bmatrix}$$

Then $\det(A_3) = a(d + kb) - b(c + ka) = ad + kab - bc - kab = ad - bc = \det(A)$.

Thus, we see that for the 2 x 2 case, swapping rows in a matrix changes 
only the sign of the determinant, scaling a row by a nonzero constant scales the 
determinant by the same constant, and executing a row replacement does not 
change the value of the determinant at all. These demonstrate the effect that the 
three elementary row operations from the process of row-reduction have on a 
2 x 2 matrix A. 
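The three observations can be checked numerically on a single case. The sketch below is in Python; the sample matrix and the constant k = 4 are arbitrary illustrative choices, not values from the text, and `det2` is a hypothetical helper name.

```python
def det2(m):
    """Determinant ad - bc of a 2 x 2 matrix given as two rows."""
    (a, b), (c, d) = m
    return a * d - b * c


A = [[3, 5],
     [2, 7]]   # arbitrary sample matrix (assumption)
k = 4          # arbitrary nonzero constant (assumption)

swapped = [A[1], A[0]]                                         # row swap
scaled = [[k * x for x in A[0]], A[1]]                         # scale row 1 by k
replaced = [A[0], [A[1][j] + k * A[0][j] for j in range(2)]]   # row 2 + k * row 1

print(det2(A))         # 11
print(det2(swapped))   # -11, the sign changes
print(det2(scaled))    # 44, scaled by k
print(det2(replaced))  # 11, unchanged
```

One numerical case is of course not a proof; the algebra in example 1.9.3 is what establishes the pattern for every 2 x 2 matrix.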


Given that the general definition of the determinant is recursive, it should not 
be surprising that the properties witnessed in example 1.9.3 can be shown to 
hold for n x n matrices. We state this result formally as our next theorem. 


Theorem 1.9.1 Let A be an n x n matrix and k a nonzero constant. Then


a. If two rows of A are exchanged to produce matrix B, then 
det(B) = — det(A). 


b. If one row of A is multiplied by k to produce B, then det(B) = k det(A). 


c. If B results from a row replacement in A, then det(B) = det(A). 


Theorem 1.9.1 enables us to more clearly see the link between invertibility 
and determinants. Through a finite number of row interchanges and row 
replacements, any square matrix A may be row-reduced to upper triangular 
form U (where we have all subdiagonal zeros, but we do not necessarily scale to 
get 1’s on the diagonal). It follows from theorem 1.9.1 that 


$$\det(A) = (-1)^k \det(U),$$

where k is the number of row interchanges needed. Note that since U is
triangular, its determinant is the product of its diagonal entries, and these entries
lie in the pivot locations of A. Thus, A has a pivot in every row if and only if this
determinant is nonzero. Specifically, we have shown that A is invertible if and
only if $\det(A) \neq 0$.
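The argument above also suggests a practical way to compute determinants: reduce to upper triangular form using only row interchanges and row replacements, count the interchanges, and multiply the diagonal entries. The following Python sketch does this with exact rational arithmetic; the function name `det_by_elimination` and the second sample matrix are illustrative assumptions, not part of the text.

```python
from fractions import Fraction


def det_by_elimination(matrix):
    """Determinant via reduction to upper triangular form U.

    Only row interchanges and row replacements are used, so
    det(A) = (-1)**swaps * (product of the diagonal entries of U).
    """
    u = [[Fraction(x) for x in row] for row in matrix]
    n = len(u)
    swaps = 0
    for col in range(n):
        # Look for a nonzero entry at or below the diagonal in this column.
        pivot = next((r for r in range(col, n) if u[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot in this column, so det(A) = 0
        if pivot != col:
            u[col], u[pivot] = u[pivot], u[col]
            swaps += 1  # each interchange flips the sign
        for r in range(col + 1, n):
            factor = u[r][col] / u[col][col]
            # Row replacement: leaves the determinant unchanged.
            u[r] = [a - factor * b for a, b in zip(u[r], u[col])]
    product = Fraction(1)
    for i in range(n):
        product *= u[i][i]
    return (-1) ** swaps * product


print(det_by_elimination([[2, -1, 1], [1, 1, 2], [-3, 0, -3]]))  # 0
print(det_by_elimination([[0, 2, 1], [1, 1, 0], [3, 0, 2]]))     # -7
```

Using `Fraction` avoids the rounding error that floating-point division would introduce, at the cost of speed; that trade-off is a choice made for this sketch.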

To conclude this section, we note that linear algebra has once again afforded 
an alternate perspective on the problem of solving an n x n system of linear 
equations, and we can now add an additional statement involving determinants 
to the Invertible Matrix Theorem. 


Theorem 1.9.2. (Invertible Matrix Theorem) Let A be an n x n matrix. The 
following statements are equivalent: 


a. A is invertible. 

b. The columns of A are linearly independent. 

c. The columns of A span $\mathbb{R}^n$.

d. A has a pivot position in every column.

e. A has a pivot position in every row.

f. A is row equivalent to $I_n$.

g. For each $b \in \mathbb{R}^n$, the equation Ax = b has a unique solution.

h. $\det(A) \neq 0$.


1.9.1 Determinants using Maple 


Obviously for most square matrices of size greater than 3 x 3, the computations 
necessary to find determinants are tedious and present potential for error. 
As with other concepts that require large numbers of arithmetic operations, 
Maple offers a single command that enables us to take advantage of the 
program’s computational powers. Given a square matrix A of any size, we simply 
enter 


> Determinant(A);


As we explore properties of determinants in the exercises of this section, 
it will prove useful to be able to generate random matrices. Within the 
LinearAlgebra package in Maple, one accomplishes this for a 3 x 3 matrix 
with the command 


> RandomMatrix(3); 


For example, if we wanted to consider the determinant of a random matrix A 
we could enter the code 


> with(LinearAlgebra):
> A := RandomMatrix(3);
> Determinant(A);


See exercise 11 for a particular instance where this code will be useful. 
Exercises 1.9

Compute (by hand) the determinant of each of the following
matrices in exercises 1-7, and hence state whether or not the matrix is invertible.



1 —3 


a: =) 


-3 1 0 | 
02 -4 O 
a 00 -—7 Ii 
00 0 6 
aad 
A=|b be 
coc f 
7. $I_n$, where $I_n$ is the n x n identity matrix.


8. For which value(s) of h is the matrix E | invertible? Explain your
answer in at least two different ways.


9. For which value(s) of z is the matrix ? 7 e 2 . / invertible? Why?


10. For which value(s) of z do nontrivial solutions x to the equation

$$\begin{bmatrix} 2 - z & 1 \\ 1 & 2 - z \end{bmatrix} x = 0$$

exist? For one such value of z, determine a nontrivial
solution x to the equation.


11. In a computer algebra system, devise code that will generate two random
3 x 3 matrices A and B, and that subsequently computes det(A), det(B), 
and det(AB). What theorem do you conjecture is true about the 
relationship between det(AB) and the individual determinants det(A) and 
det(B)? 
12. In a computer algebra system, devise code that will generate a random
3 x 3 matrix A and that subsequently computes its transpose $A^T$, as well as
$\det(A)$ and $\det(A^T)$. What theorem do you conjecture is true about the
relationship between $\det(A)$ and $\det(A^T)$?


13. Use the formula conjectured in exercise 11 above to show that if A is
invertible, then $\det(A^{-1}) = \dfrac{1}{\det(A)}$. (Hint: $AA^{-1} = I$.)


14. What can you say about the determinant of any square matrix in which
one of the columns (or rows) is zero? Why?


15. What can you say about the determinant of any square matrix where one
of the columns (or rows) is repeated in the matrix? Why? 


16. Suppose that A is an n x n matrix and that Ax = 0 has infinitely many
solutions. What can you say about det(A)? Why?


17. Suppose that $A^2$ is not invertible. Can you determine if A is invertible or
not? Explain.


18. Two matrices A and B are said to be similar if there exists an invertible
matrix P such that $A = PBP^{-1}$. What can you say about the determinants
of similar matrices?


19. Let A be an arbitrary 2 x 2 matrix of the form

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

where $a \neq 0$ and A is assumed to be invertible. Working by hand, row
reduce the augmented matrix [A I] and hence determine a formula for
$A^{-1}$ in terms of the entries of A. What role does det(A) play in the formula
for $A^{-1}$?


20. Decide whether each of the following sentences is true or false. In every
case, write one sentence to support your answer. 


(a) Swapping the rows in a square matrix A does not change the value of 
det(A). 

(b) If A is a square matrix with a pivot in every column, then det(A) = 0. 

(c) The determinant of any diagonal matrix is the product of its diagonal 
entries. 

(d) If A is an n x n matrix and Ax = b has a unique solution for every
$b \in \mathbb{R}^n$, then $\det(A) \neq 0$.


1.10 The eigenvalue problem 


Another powerful characteristic of linear algebra is the way the subject often 
allows us to better understand an infinite collection of objects in terms of the 
properties of a small, finite number of elements in the set. For example, if we have
a set of three linearly independent vectors that spans $\mathbb{R}^3$, then every vector in $\mathbb{R}^3$
may be understood as a unique linear combination of the three special vectors
in the linearly independent spanning set. Thus, in some ways it is sufficient to
understand these three vectors, and to use that knowledge to better understand
the rest of the vectors in $\mathbb{R}^3$. In a similar way, as we will see in this section, for
an n x n matrix A there are up to n important vectors (called eigenvectors) that
enable us to better understand a variety of properties of the matrix. 

The process of matrix multiplication enables us to associate a function with 
any given matrix A. For example, if A is a 2 x 2 matrix, then we may define a 
function T by the formula 


T(x) = Ax (1.10.1) 


Note that the domain of the function T is $\mathbb{R}^2$, the set of all vectors with two
entries. Moreover, note that every output of the function T is also a vector in
$\mathbb{R}^2$. We therefore use the notation $T : \mathbb{R}^2 \to \mathbb{R}^2$. This is analogous to familiar
functions like $f(x) = x^2$, where for every real number input we obtain a real
number output ($f : \mathbb{R} \to \mathbb{R}$); the difference here is that for the function T, for
every vector input we get a vector output. In what follows, we go in search of 
special input vectors to the function T for which the corresponding output is 
particularly simple to compute. The next example will highlight the properties 
of the vector(s) we seek. 


Example 1.10.1 Explore the geometric effect of the matrix 


$$A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$$

on the vectors $u = [1 \ 0]^T$ and $v = [1 \ 1]^T$ from the perspective of the function
T(x) = Ax.


Solution. We first compute $T(u) = Au = [2 \ 1]^T$. In figure 1.15, we see a plot
of the vector u on the left, and T(u) on the right. This shows that the geometric
effect of T on u is to rotate u and stretch it. For the vector v, we observe
that $T(v) = Av = [3 \ 3]^T$. Graphically, as shown in figure 1.16, it is clear that
T(v) is simply a stretch of v by a factor of 3. Said slightly differently, we might
write that

$$T(v) = Av = \begin{bmatrix} 3 \\ 3 \end{bmatrix} = 3\begin{bmatrix} 1 \\ 1 \end{bmatrix} = 3v$$

This shows that the result of the function T (and hence the matrix A) being
applied to the vector v is particularly simple: v is only stretched by T.
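The two products in this example are easy to reproduce. Here is a small Python sketch; `mat_vec` is an illustrative helper name, and vectors are represented as plain lists.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of entries)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


A = [[2, 1],
     [1, 2]]
u = [1, 0]
v = [1, 1]

print(mat_vec(A, u))  # [2, 1]: not a multiple of u, so u is turned as well as stretched
print(mat_vec(A, v))  # [3, 3]: exactly 3 * v, so v is only stretched
```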


Figure 1.15 The vectors u and T(u) in example 1.10.1.

Figure 1.16 The vectors v and T(v).


For any m x n matrix A, there is an associated function $T : \mathbb{R}^n \to \mathbb{R}^m$ defined
by T(x) = Ax. This function takes a given vector in $\mathbb{R}^n$ and maps it to a
corresponding vector in $\mathbb{R}^m$; in every case, we may view this output as resulting
from the input vector being stretched and/or rotated. Input vectors that are
only stretched have corresponding outputs that are simplest to determine: the
input vector is simply multiplied by a scalar. To put this another way, for these
stretched-only vectors, multiplying them by A is equivalent to multiplying them
by a constant. Such vectors prove to be important for a host of reasons, and are
called the eigenvectors of a matrix A.


Definition 1.10.1 For a given n x n matrix A, a nonzero vector v is said to be
an eigenvector of A if and only if there exists a scalar $\lambda$ such that

$$Av = \lambda v \tag{1.10.2}$$

The scalar $\lambda$ is called the eigenvalue corresponding to the eigenvector v.
In example 1.10.1, we found that the vector $v = [1 \ 1]^T$ is an eigenvector
of the given matrix A with corresponding eigenvalue 3 since Av = 3v. What is
not yet clear is how we even begin to find eigenvectors and eigenvalues. We will
soon see that some of the many different perspectives we can take on systems of
linear equations will help us solve this problem.

In general, given an n x n matrix A, we seek eigenvectors v that are, by
definition, nonzero and satisfy the equation $Av = \lambda v$. In one sense, what makes
this problem challenging is that neither v nor $\lambda$ is initially known. We thus
explore some different perspectives on the problem to see if we can highlight
the role of either v or $\lambda$. Early in this chapter, we spent significant effort
studying homogeneous equations and the circumstances under which they have
nontrivial solutions. Here, the eigenvector problem can be rephrased in a similar
light. Subtracting $\lambda v$ from both sides of (1.10.2), we equivalently seek $\lambda$ and v
such that

$$Av - \lambda v = 0 \tag{1.10.3}$$

Viewing $\lambda v$ as $(\lambda I)v$, we can factor (1.10.3) and write

$$(A - \lambda I)v = 0 \tag{1.10.4}$$

Now the question becomes, “for which values of $\lambda$ does (1.10.4) have a nontrivial
solution?” At this point, we recall theorem 1.6.2, which tells us that the equation
Bx = 0 has only the trivial solution if and only if the matrix B has a pivot in
every column. To have a nontrivial solution, we therefore want $A - \lambda I$ to not
have a pivot in every column. In (1.10.4), the matrix $A - \lambda I$ is square, so by the
Invertible Matrix Theorem such a nontrivial solution exists if and only if $A - \lambda I$
is not invertible.

This last observation brings us, finally, to determinants. As we saw in
Section 1.9, a matrix is invertible if and only if its determinant is nonzero.
Therefore, a nontrivial solution to (1.10.4) exists whenever $\lambda$ is such that
$\det(A - \lambda I) = 0$. In the next example, we explore how this equation enables
us to find the eigenvalues of a matrix A, and hence the eigenvectors as well.


Example 1.10.2 Find the eigenvalues and eigenvectors of the matrix

$$A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$$

Solution. As seen in our preceding discussion, by the definition of eigenvalues
and eigenvectors, $\lambda$ is an eigenvalue of A if and only if the equation $(A - \lambda I)v = 0$
has a nontrivial solution. Note first that $A - \lambda I$ is the matrix A with the scalar $\lambda$
subtracted from each diagonal entry since

$$A - \lambda I = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} - \begin{bmatrix} \lambda & 0 \\ 0 & \lambda \end{bmatrix} = \begin{bmatrix} 2 - \lambda & 1 \\ 1 & 2 - \lambda \end{bmatrix}$$

We next compute $\det(A - \lambda I)$ so that we can see which values of $\lambda$ make this
determinant zero. In particular, we have

$$\begin{aligned}
\det(A - \lambda I) &= \det\begin{bmatrix} 2 - \lambda & 1 \\ 1 & 2 - \lambda \end{bmatrix} \\
&= (2 - \lambda)^2 - 1 \\
&= \lambda^2 - 4\lambda + 3
\end{aligned} \tag{1.10.5}$$

Thus, in order for $\det(A - \lambda I) = 0$, $\lambda$ must satisfy the equation $\lambda^2 - 4\lambda + 3 = 0$.
Factoring, $(\lambda - 3)(\lambda - 1) = 0$, and therefore $\lambda = 3$ and $\lambda = 1$ are
eigenvalues of A. The value $\lambda = 3$ is not surprising, given our earlier discoveries
in example 1.10.1.
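For a 2 x 2 matrix with rows [a b] and [c d], expanding $\det(A - \lambda I)$ gives $\lambda^2 - (a + d)\lambda + (ad - bc)$, so the real eigenvalues can be read off from the quadratic formula. The Python sketch below does exactly that; the function name `real_eigenvalues_2x2` is an illustrative assumption, and the approach is specific to the 2 x 2 case.

```python
import math


def real_eigenvalues_2x2(m):
    """Real roots of lambda^2 - (a + d)*lambda + (a*d - b*c) = 0."""
    (a, b), (c, d) = m
    trace = a + d
    determinant = a * d - b * c
    discriminant = trace ** 2 - 4 * determinant
    if discriminant < 0:
        return []  # no real eigenvalues
    root = math.sqrt(discriminant)
    # A set removes the duplicate when the root is repeated.
    return sorted({(trace - root) / 2, (trace + root) / 2})


A = [[2, 1],
     [1, 2]]
print(real_eigenvalues_2x2(A))  # [1.0, 3.0]
```

The three possible outcomes (an empty list, one value, or two values) correspond to a negative, zero, or positive discriminant.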


Next, we proceed to find the eigenvectors that correspond to each eigenvalue.
Beginning with $\lambda = 3$, we seek nonzero vectors v that satisfy Av = 3v, or
equivalently

$$(A - 3I)v = 0$$

This problem is a familiar one: solving a homogeneous system of linear equations
for which infinitely many solutions exist. Augmenting $A - 3I$ with a column of
zeros and row-reducing, we find that

$$\begin{bmatrix} -1 & 1 & 0 \\ 1 & -1 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & -1 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

Note that from the very definition of an eigenvector, by which we seek a
nontrivial solution to $(A - \lambda I)v = 0$, it must be the case at this point that the
matrix $A - \lambda I$ does not have a pivot in every row. Interpreting the row-reduced
matrix with the free variable $v_2$, we find that the vector $v = [v_1 \ v_2]^T$ must satisfy
$v_1 - v_2 = 0$. Thus, any vector v of the form

$$v = \begin{bmatrix} v_2 \\ v_2 \end{bmatrix} = v_2\begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

is an eigenvector of A that corresponds to the eigenvalue $\lambda = 3$. In particular,
we observe that any scalar multiple of the vector $v = [1 \ 1]^T$ is an eigenvector of
A with associated eigenvalue 3. We say that the set of all eigenvectors associated
with eigenvalue 3 is the eigenspace corresponding to $\lambda = 3$.

It now only remains to find the eigenvectors associated with $\lambda = 1$. We
proceed in the same manner as above, now solving the homogeneous equation
$(A - 1I)v = 0$. Row-reducing, we find that

$$\begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

and therefore the eigenvector v must satisfy $v_1 + v_2 = 0$ and have the form

$$v = \begin{bmatrix} -v_2 \\ v_2 \end{bmatrix} = v_2\begin{bmatrix} -1 \\ 1 \end{bmatrix}$$

Here, any scalar multiple of $v = [-1 \ 1]^T$ is an eigenvector of A corresponding
to $\lambda = 1$.
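An eigenvalue and eigenvector pair can always be checked directly against definition 1.10.1. The short Python sketch below does so for the results of this example; `mat_vec` and `is_eigenpair` are illustrative helper names.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of entries)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def is_eigenpair(matrix, lam, v):
    """True when v is nonzero and matrix * v equals lam * v."""
    if all(x == 0 for x in v):
        return False  # eigenvectors are nonzero by definition
    return mat_vec(matrix, v) == [lam * x for x in v]


A = [[2, 1],
     [1, 2]]

print(is_eigenpair(A, 3, [1, 1]))    # True
print(is_eigenpair(A, 1, [-1, 1]))   # True
print(is_eigenpair(A, 3, [5, 5]))    # True: scalar multiples also work
print(is_eigenpair(A, 3, [1, 0]))    # False
```

The comparison uses exact equality, which is safe here because all entries are integers; with floating-point entries a tolerance would be needed.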


There are several important general observations to be made from
example 1.10.2. One is that for any 2 x 2 matrix, the matrix will have 0, 1, or
2 real eigenvalues. This comes from the fact that $\det(A - \lambda I)$ is a quadratic
function in the variable $\lambda$, and therefore can have up to two real zeros. While
it is possible to consider complex eigenvalues, we will wait until these arise 
in our study of systems of differential equations to address them in detail. In 
addition, we note that there are infinitely many eigenvectors associated with each 
eigenvalue. Often we will be interested in finding representative eigenvectors— 
ones for which all others with the same eigenvalue are linear combinations. 
Finally, it is worthwhile to note that the two representative eigenvectors found 
in example 1.10.2, corresponding respectively to the two distinct eigenvalues, are 
linearly independent. More on why this is important will be discussed at the end 
of this section; for now, we remark that it is possible to show that eigenvectors 
corresponding to distinct eigenvalues are always linearly independent. This fact 
will be proved in exercise 16. 

The observations in the preceding paragraph generalize to the case of
n x n matrices. It may be shown that $\det(A - \lambda I)$ is a polynomial of degree n
in $\lambda$. This function is usually called the characteristic polynomial; the equation
$\det(A - \lambda I) = 0$ is typically referred to as the characteristic equation. Because
the characteristic polynomial has degree n, it follows that A has up to n real
eigenvalues⁵.

Next we consider two additional examples that demonstrate some more of 
the possibilities and important ideas that arise in trying to find the eigenvalues 
and eigenvectors of a given matrix. 


Example 1.10.3 Determine the eigenvalues and eigenvectors of the matrix 


$$R = \begin{bmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix}$$

In addition, explore the geometric effect of the function T(v) = Rv on vectors
in $\mathbb{R}^2$.


⁵ See appendix C for a review and discussion of important properties of roots of polynomial
equations.
Solution. We consider the characteristic equation $\det(R - \lambda I) = 0$ and hence
solve

$$0 = \det\begin{bmatrix} 1/\sqrt{2} - \lambda & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} - \lambda \end{bmatrix} = \left(\frac{1}{\sqrt{2}} - \lambda\right)^2 + \frac{1}{2} = \lambda^2 - \sqrt{2}\lambda + 1$$

By the quadratic formula, it follows that

$$\lambda = \frac{\sqrt{2} \pm \sqrt{2 - 4}}{2} = \frac{\sqrt{2} \pm i\sqrt{2}}{2}$$

which shows that R does not have any real eigenvalues. If we explore the
geometric effect of T(v) = Rv graphically, we can better understand why
this is the case. Beginning with the vector $e_1 = [1 \ 0]^T$ and computing
$Re_1 = [1/\sqrt{2} \ \ 1/\sqrt{2}]^T$, as seen in figure 1.17, we see that the function T(x) = Rx
rotates the vector $e_1$ counterclockwise by $\pi/4$ radians, and (as computing the
length of each vector shows) there is no stretching involved. Similarly, for
the vector $e_2 = [0 \ 1]^T$, we can see that $Re_2 = [-1/\sqrt{2} \ \ 1/\sqrt{2}]^T$. Just as with the
previous vector $e_1$, we see that the function T(v) = Rv simply rotates the vector
$e_2$ counterclockwise by $\pi/4$ radians.

In fact, since every vector in $\mathbb{R}^2$ can be written as a linear combination of
$e_1$ and $e_2$, it follows that the image Rv of any vector v is simply the original
vector rotated counterclockwise $\pi/4$ radians. This shows that no vector in $\mathbb{R}^2$
is simply stretched under multiplication by R, and therefore R has no real
eigenvectors.
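Both halves of this argument, the negative discriminant and the rotation by $\pi/4$ without stretching, can be confirmed numerically. The Python sketch below uses the standard `cmath` module to take the square root of a negative number; the variable names are illustrative, and results agree only up to floating-point rounding.

```python
import cmath
import math

s = 1 / math.sqrt(2)
R = [[s, -s],
     [s, s]]

# Characteristic equation of a 2 x 2 matrix:
# lambda^2 - trace*lambda + determinant = 0.
trace = R[0][0] + R[1][1]
determinant = R[0][0] * R[1][1] - R[0][1] * R[1][0]
discriminant = trace ** 2 - 4 * determinant  # approximately 2 - 4 = -2

# cmath.sqrt accepts a negative argument and returns a complex root.
eigenvalues = [(trace + sign * cmath.sqrt(discriminant)) / 2 for sign in (1, -1)]

# Image of e1 = [1, 0] under R is the first column of R.
Re1 = [R[0][0], R[1][0]]
angle = math.atan2(Re1[1], Re1[0])      # angle from the positive x-axis
length = math.hypot(Re1[0], Re1[1])     # length of the image vector

print(discriminant < 0)    # True: no real eigenvalues
print(eigenvalues)         # approximately 0.707 + 0.707i and 0.707 - 0.707i
print(angle, math.pi / 4)  # the two values agree
print(length)              # approximately 1, so no stretching
```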


Figure 1.17 The vectors $e_1$ and $T(e_1) = Re_1$.
Matrices such as R in example 1.10.3 with the property that they rotate every
vector by a fixed angle (with no stretching factor) are usually called rotation
matrices.

Other interesting cases arise in the search for eigenvectors when some of
the eigenvalues are repeated, that is, when a value $\lambda$ is a multiple root of the
characteristic equation $\det(A - \lambda I) = 0$. We explore this further in the next
example.


Example 1.10.4 Determine all eigenvalues and eigenvectors of the matrix 


$$A = \begin{bmatrix} 5 & 6 & 2 \\ 0 & -1 & -8 \\ 1 & 0 & -2 \end{bmatrix}$$


Solution. As in previous examples, we first compute $\det(A - \lambda I)$. Doing so
and simplifying yields

$$\det(A - \lambda I) = -36 + 15\lambda + 2\lambda^2 - \lambda^3$$

Factoring, it follows that

$$\det(A - \lambda I) = -(\lambda + 4)(\lambda - 3)^2$$

Setting the characteristic polynomial equal to zero, it is required that
$-(\lambda + 4)(\lambda - 3)^2 = 0$. This shows that A has two distinct eigenvalues; moreover, just as
with zeros of polynomials, we say that $\lambda = -4$ has multiplicity 1, while $\lambda = 3$
has multiplicity 2.

We now find the eigenvectors corresponding to each eigenvalue. For
$\lambda = -4$, we solve the equation $(A + 4I)v = 0$, and see by row-reducing that

$$\begin{bmatrix} 9 & 6 & 2 & 0 \\ 0 & 3 & -8 & 0 \\ 1 & 0 & 2 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & 2 & 0 \\ 0 & 1 & -\frac{8}{3} & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

Note that $v_3$ is a free variable, and that the corresponding eigenvector v must
have components which satisfy $v_1 + 2v_3 = 0$ and $v_2 - \frac{8}{3}v_3 = 0$, which shows that
v has form

$$v = v_3\begin{bmatrix} -2 \\ \frac{8}{3} \\ 1 \end{bmatrix}$$

Likewise, for $\lambda = 3$, we consider $(A - 3I)v = 0$, and row-reduce to find that

$$\begin{bmatrix} 2 & 6 & 2 \\ 0 & -4 & -8 \\ 1 & 0 & -5 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & -5 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{bmatrix}$$

This leads us to see that the corresponding eigenvector has form

$$v = v_3\begin{bmatrix} 5 \\ -2 \\ 1 \end{bmatrix}$$

Therefore, we see that for this matrix A, the matrix has two distinct eigenvalues
(-4 and 3), and each of these eigenvalues has only one associated linearly
independent eigenvector. That is, every eigenvector of A associated with $\lambda = -4$
is a scalar multiple of $[-2 \ \ \frac{8}{3} \ \ 1]^T$ while every eigenvector associated with $\lambda = 3$
is a scalar multiple of $[5 \ -2 \ \ 1]^T$.
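The computations in this example are easy to get wrong by hand, so an exact check is worthwhile. The Python sketch below compares the stated characteristic polynomial with $\det(A - \lambda I)$ at several sample values of $\lambda$ and then tests both eigenvectors against $Av = \lambda v$; the helper names `det3`, `shifted`, `char_poly`, and `mat_vec` are illustrative assumptions.

```python
from fractions import Fraction

A = [[5, 6, 2],
     [0, -1, -8],
     [1, 0, -2]]


def det3(m):
    """3 x 3 determinant by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def shifted(m, lam):
    """Return the matrix m - lam * I."""
    return [[m[i][j] - (lam if i == j else 0) for j in range(3)]
            for i in range(3)]


def char_poly(lam):
    """The characteristic polynomial stated in the example."""
    return -36 + 15 * lam + 2 * lam ** 2 - lam ** 3


def mat_vec(m, v):
    return [sum(a * x for a, x in zip(row, v)) for row in m]


# Two cubics that agree at more than three points are identical.
samples_agree = all(det3(shifted(A, lam)) == char_poly(lam)
                    for lam in range(-5, 6))

v_neg4 = [Fraction(-2), Fraction(8, 3), Fraction(1)]  # eigenvector for -4
v_3 = [5, -2, 1]                                      # eigenvector for 3

print(samples_agree)                                  # True
print(char_poly(-4), char_poly(3))                    # 0 0
print(mat_vec(A, v_neg4) == [-4 * x for x in v_neg4]) # True
print(mat_vec(A, v_3) == [3 * x for x in v_3])        # True
```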


In the three preceding examples, we have seen that an n x n matrix has up to
n real eigenvalues. It turns out that there are also up to n linearly independent
eigenvectors of the matrix. For many reasons, the best possible scenario is
when a matrix has n linearly independent eigenvectors, such as the matrix A
in example 1.10.2. In that 2 x 2 situation, A had two distinct real eigenvalues,
and two corresponding linearly independent eigenvectors. One reason that this
is so useful is that the eigenvectors are not only linearly independent, but also
span $\mathbb{R}^2$. If we call the two eigenvectors found in example 1.10.2 u and v,
corresponding to $\lambda = 3$ and $\mu = 1$, respectively, then, since these two vectors
are linearly independent in $\mathbb{R}^2$ and span $\mathbb{R}^2$, we can write every vector in $\mathbb{R}^2$
uniquely as a linear combination of u and v.
In particular, given a vector x, there exist coefficients $\alpha$ and $\beta$ such that

$$x = \alpha u + \beta v$$

If we are interested in computing Ax, we can do so now solely by knowing
how A acts on the eigenvectors. Specifically, if we apply the linearity of matrix
multiplication and the definition of eigenvectors, we have

$$\begin{aligned}
Ax &= A(\alpha u + \beta v) \\
&= \alpha Au + \beta Av \\
&= \alpha\lambda u + \beta\mu v
\end{aligned}$$


This then reduces matrix multiplication essentially to scalar multiplication. 
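To make this concrete, the Python sketch below computes Ax two ways for the matrix of example 1.10.2: directly, and through the eigenvector decomposition. The sample vector x = [4, -1] is an arbitrary choice, and the formulas for $\alpha$ and $\beta$ are specific to u = [1, 1] and v = [-1, 1]; both are assumptions made for illustration.

```python
from fractions import Fraction

A = [[2, 1],
     [1, 2]]
u, lam = [1, 1], 3    # eigenvector and eigenvalue from example 1.10.2
v, mu = [-1, 1], 1


def coefficients(x):
    """Solve x = alpha*u + beta*v for u = [1, 1], v = [-1, 1].

    The two component equations are x1 = alpha - beta and
    x2 = alpha + beta.
    """
    alpha = Fraction(x[0] + x[1], 2)
    beta = Fraction(x[1] - x[0], 2)
    return alpha, beta


def apply_via_eigenvectors(x):
    """Compute Ax as alpha*lam*u + beta*mu*v, using only scalar products."""
    alpha, beta = coefficients(x)
    return [alpha * lam * u[i] + beta * mu * v[i] for i in range(2)]


def mat_vec(matrix, vector):
    return [sum(a * t for a, t in zip(row, vector)) for row in matrix]


x = [4, -1]  # arbitrary sample vector (assumption)
print(coefficients(x))            # alpha = 3/2, beta = -5/2
print(apply_via_eigenvectors(x))  # equals [7, 2]
print(mat_vec(A, x))              # [7, 2]
```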

In conclusion, we have seen in this section that via matrix multiplication, 
every matrix can be viewed as a function in the way that, through multiplication, 
it stretches and rotates vectors. Those vectors that are only stretched are 
called eigenvectors, and the factors by which the matrix stretches them are
called eigenvalues. By knowing the eigenvalues and eigenvectors, we can better
understand how A acts on an arbitrary vector, and, with some more sophisticated 
approaches, even further understand key properties of the matrix. Some of these 
properties will be studied in detail later in this text when we consider systems of 
differential equations. 
1.10.1 Markov chains, eigenvectors, and Google


In a Markov process such as the one discussed in subsection 1.3.1 that represents
the transition of voters from one classification to another, it is natural to wonder
whether or not there is a distribution of voters for which the total number in
each category will remain constant from one year to the next. For example, for
the Markov process represented by

$$x^{(n+1)} = Mx^{(n)} \tag{1.10.6}$$

where M is the matrix

$$M = \begin{bmatrix} 0.95 & 0.03 & 0.07 \\ 0.02 & 0.90 & 0.13 \\ 0.03 & 0.07 & 0.80 \end{bmatrix}$$

we can ask: is there a voter distribution x such that Mx = x? In light of our
most recent work with eigenvalues and eigenvectors, we see that this question
is equivalent to asking if the matrix M has $\lambda = 1$ as an eigenvalue with some
corresponding eigenvector that can represent a voter distribution.

If we compute the eigenvalues and eigenvectors of M, we find that the 
eigenvalues are A = 1.000, 0.911, 0.739. The eigenvector corresponding to A = 1 
is v = [0.770 0.558 0.311]. Scaling v so that the sum of its entries is 250, we 
see that the eigenvector 


v =[117.450 85.113 47.437]" 


represents the distribution of a population of 250000 people in such a way that 
the total number of Democrats, Republicans, and Independents does not change 
from one year to the next, under the hypothesis that voters change categories 
annually according to the likelihoods expressed in the Markov matrix M. This 
eigenvector is sometimes also called a stationary vector. 

Remarkably, in our earlier computations in
subsection 1.3.1 for this Markov chain, we observed that the sequence of vectors
$\mathbf{x}^{(1)}, \mathbf{x}^{(2)}, \ldots, \mathbf{x}^{(20)}$ was approaching a single vector. In fact, the limiting value
of this sequence is the eigenvector $\mathbf{v} = [117.450 \;\; 85.113 \;\; 47.437]^T$. That this
phenomenon occurs is the result of the so-called Power method, a rudimentary
numerical technique for computing an eigenvalue-eigenvector pair of a matrix.
More about this concept can be studied in the project on discrete dynamical 
systems found in section 1.13.3. 
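
The repeated multiplication behind this observation can be sketched in a few lines. The code below is only an illustration: the starting distribution (in thousands) and the number of iterations are our own assumptions, not values taken from subsection 1.3.1. The iterates settle near the stationary vector reported above; small differences in the second decimal place arise because that vector was obtained by scaling an eigenvector already rounded to three decimals.

```python
# Power iteration x^(n+1) = M x^(n) for the voter Markov matrix M.
M = [[0.95, 0.03, 0.07],
     [0.02, 0.90, 0.13],
     [0.03, 0.07, 0.80]]

def mat_vec(M, x):
    """Return the matrix-vector product Mx."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]

# Hypothetical starting distribution (in thousands); the entries sum to 250.
x = [100.0, 100.0, 50.0]

# Assumed iteration count, chosen large enough for the iterates to settle.
for _ in range(500):
    x = mat_vec(M, x)

print([round(c, 3) for c in x])
print(round(sum(x), 6))   # the total population is preserved
```

Because each column of M sums to 1, every iterate has the same total as the starting vector, so no rescaling step is needed here.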


Example 1.10.5 Find the stationary vector from the matrix in example 1.3.3.

Solution. Under the assumptions stated in example 1.3.3, we saw that the
migration of citizens from urban to suburban areas of a metropolitan area, or
vice versa, was modeled by the Markov process $\mathbf{x}^{(n+1)} = M\mathbf{x}^{(n)}$ where M is the
matrix

\[ M = \begin{bmatrix} 0.85 & 0.08 \\ 0.15 & 0.92 \end{bmatrix} \]

Solving the equation x = Mx by writing (M - I)x = 0, we see that we need to
find the eigenvector of M that corresponds to $\lambda = 1$. Doing so, we find that the
eigenvector is (up to a nonzero scalar multiple)

\[ \mathbf{v} = \begin{bmatrix} 8 \\ 15 \end{bmatrix} \]

Scaling this vector so that the sum of its entries is one, we see that the population
stabilizes when it is distributed with 34.78 percent in the city and 65.22 percent
in the suburbs, in accordance with the vector $[0.3478 \;\; 0.6522]^T$.
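
The computation in example 1.10.5 can be checked with exact arithmetic. The sketch below assumes the migration matrix M = [0.85 0.08; 0.15 0.92], whose second row is fixed by the requirement that each column sum to 1; the variable names are ours. It reads an eigenvector for the eigenvalue 1 off the first row of (M - I)x = 0 and then scales it so that its entries sum to one.

```python
from fractions import Fraction

# Assumed migration matrix from example 1.10.5 (each column sums to 1).
M = [[Fraction(85, 100), Fraction(8, 100)],
     [Fraction(15, 100), Fraction(92, 100)]]

# (M - I)x = 0: the first row reads (m11 - 1)*x1 + m12*x2 = 0,
# so x = [m12, 1 - m11] is an eigenvector for the eigenvalue 1.
x = [M[0][1], 1 - M[0][0]]

# Scale so that the entries sum to one.
total = sum(x)
stationary = [xi / total for xi in x]

# Confirm that M leaves the stationary vector unchanged.
M_times_s = [sum(M[i][j] * stationary[j] for j in range(2)) for i in range(2)]

print(stationary)
print([round(float(s), 4) for s in stationary])
print(M_times_s == stationary)
```

The exact stationary vector is [8/23, 15/23], which rounds to the percentages quoted in the example.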


One of the most stunning applications of eigenvalues and eigenvectors can be 
found on the World Wide Web. In particular, the idea of finding a stationary 
vector that satisfies Mx = x is at the center of Google’s Page Rank Algorithm 
that it uses to index the importance of billions of pages on the Internet. What is 
particularly challenging about this problem is the fact that the stochastic matrix 
M used by the algorithm is a square matrix that has one column for every page 
on the World Wide Web that is indexed by Google! In early 2007, this meant 
that M was a matrix with 25 billion columns. Nonetheless, properties of the 
matrix M and sophisticated numerical algorithms make it possible for modern 
computers to quickly find the stationary vector of M and hence provide the user 
with the results we have all grown accustomed to in using Google.*

* A detailed description of how the Page Rank Algorithm works and the role that eigenvectors play
may be read at http://www.ams.org/featurecolumn/archive/pagerank.html.


1.10.2 Using Maple to find eigenvalues and eigenvectors


Due to its reliance upon determinants and the solution of polynomial
equations, the eigenvalue problem is computationally difficult for any case
larger than 3 × 3. Sophisticated algorithms have been developed to compute
eigenvalues and eigenvectors efficiently and accurately. One of these is the
so-called QR algorithm, which through an iterative technique produces excellent
approximations to eigenvalues and eigenvectors simultaneously.

While Maple implements these algorithms and can find both eigenvalues
and eigenvectors, it is essential that we not only understand what the program
is attempting to compute, but also how to interpret the resulting output.
As always, in what follows we are working within the LinearAlgebra
package.

Given an n × n matrix A, we can compute the eigenvalues of A with the
command


> Eigenvalues(A);


Doing so for the matrix


\[ A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} \]


from example 1.10.2 yields the Maple output


\[ \begin{bmatrix} 3 \\ 1 \end{bmatrix} \]


Despite the vector format, the program is telling us that the two eigenvalues 
of the matrix A are 3 and 1. If we desire the eigenvectors, too, we can use the
command


> Eigenvectors(A);


which leads to the output


\[ \begin{bmatrix} 3 \\ 1 \end{bmatrix}, \quad \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix} \]

Here, the first vector tells us the eigenvalues of A. The following matrix holds the
corresponding eigenvectors in its columns; the vector $[1 \;\; 1]^T$ is the eigenvector
corresponding to $\lambda = 3$ and $[-1 \;\; 1]^T$ corresponds to $\lambda = 1$.
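Maple also handles complex numbers without difficulty.
So, if we enter a matrix like the one in example 1.10.3 that has no real eigenvalues,
Maple will find complex eigenvalues and eigenvectors. To see how this appears,
we enter the matrix

\[ R = \begin{bmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix} \]


and execute the command

> Eigenvectors(R);


The resulting output is

\[ \begin{bmatrix} \tfrac{1}{2}\sqrt{2} + \tfrac{1}{2} I \sqrt{2} \\ \tfrac{1}{2}\sqrt{2} - \tfrac{1}{2} I \sqrt{2} \end{bmatrix}, \quad \begin{bmatrix} I & -I \\ 1 & 1 \end{bmatrix} \]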


Note that here Maple is using 'I' to denote not the identity matrix, but rather
$\sqrt{-1}$. Just as we saw in example 1.10.3, R does not have any real eigenvalues. We
can use familiar properties of complex numbers (most importantly, $I^2 = -1$) to
actually check that the equation $A\mathbf{x} = \lambda\mathbf{x}$ holds for the listed complex eigenvalues
and complex eigenvectors above. However, at this point in our study, these
complex eigenvectors are of less importance, so we defer further details on them
until later work with systems of differential equations.
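
The check described above can also be carried out numerically. The following sketch assumes that R is the rotation matrix with entries ±1/√2 and that the eigenpairs are (1 + i)/√2 with [i 1]^T and (1 - i)/√2 with [-i 1]^T; the function name `residual` is ours. Python writes the imaginary unit as `1j` where Maple writes `I`.

```python
import math

s = 1 / math.sqrt(2)
R = [[s, -s],
     [s,  s]]            # assumed form of R: rotation by 45 degrees

# Candidate eigenpairs (eigenvalue, eigenvector), assumed for this check.
pairs = [(complex(s, s), [1j, 1]),
         (complex(s, -s), [-1j, 1])]

def residual(R, lam, x):
    """Largest entry, in absolute value, of Rx - lam*x."""
    Rx = [sum(R[i][j] * x[j] for j in range(2)) for i in range(2)]
    return max(abs(Rx[i] - lam * x[i]) for i in range(2))

for lam, x in pairs:
    # A residual at the level of rounding error confirms Rx = lam*x.
    print(lam, residual(R, lam, x))
```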

One final example is relevant here to see how Maple deals with repeated
eigenvalues and missing eigenvectors. If we enter the 3 × 3 matrix A from
example 1.10.4 and execute the Eigenvectors command, we receive the
output


\[ \begin{bmatrix} 3 \\ 3 \\ -4 \end{bmatrix}, \quad \begin{bmatrix} 5 & 0 & -2 \\ -2 & 0 & \tfrac{8}{3} \\ 1 & 0 & 1 \end{bmatrix} \]


Here we see that 3 is a repeated eigenvalue of A with multiplicity 2. The
first two columns of the matrix in the output contain the (potentially)
linearly independent eigenvectors which correspond to this eigenvalue. The
second column of all zeros indicates that A has only one linearly independent
eigenvector corresponding to this particular eigenvalue. The third column, of
course, is the eigenvector associated with the eigenvalue $\lambda = -4$. The column
of all zeros also demonstrates that $\mathbb{R}^3$ does not have a linearly independent
spanning set that consists of eigenvectors of A.
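
These claims can be confirmed with exact arithmetic. The sketch below uses the matrix A = [5 6 2; 0 -1 -8; 1 0 -2] from example 1.10.4; the helper functions `mat_vec` and `rank` are our own and are not Maple commands. It verifies the two eigenvectors and counts the linearly independent eigenvectors for the eigenvalue 3 as 3 minus the rank of A - 3I.

```python
from fractions import Fraction

A = [[5, 6, 2],
     [0, -1, -8],
     [1, 0, -2]]

def mat_vec(A, x):
    """Return the matrix-vector product Ax."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def rank(rows):
    """Rank of a matrix, by Gaussian elimination with exact fractions."""
    rows = [[Fraction(a) for a in row] for row in rows]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

v3 = [5, -2, 1]                 # eigenvector for the eigenvalue 3
v4 = [-2, Fraction(8, 3), 1]    # eigenvector for the eigenvalue -4

print(mat_vec(A, v3) == [3 * c for c in v3])
print(mat_vec(A, v4) == [-4 * c for c in v4])

# Number of linearly independent eigenvectors for the eigenvalue 3.
A_minus_3I = [[A[i][j] - (3 if i == j else 0) for j in range(3)] for i in range(3)]
independent_for_3 = 3 - rank(A_minus_3I)
print(independent_for_3)
```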


Exercises 1.10 In exercises 1-8, compute (by hand) the eigenvalues and any
corresponding real eigenvectors of the given matrix A.


9. A 2 × 2 matrix A has eigenvalues 5 and -1 and corresponding
eigenvectors $\mathbf{u} = [0 \;\; 1]^T$ and $\mathbf{v} = [1 \;\; 0]^T$. Use this information to compute
Ax, where x is the vector $\mathbf{x} = [-5 \;\; 4]^T$.


10. A 2 × 2 matrix A has eigenvalues -3 and -2 and corresponding
eigenvectors $\mathbf{u} = [-1 \;\; 1]^T$ and $\mathbf{v} = [1 \;\; 1]^T$. Use this information to
compute Ax, where x is the vector $\mathbf{x} = [-3 \;\; 5]^T$.


11. Consider the matrix


(a) Determine the eigenvalues and eigenvectors of A.

(b) Does $\mathbb{R}^2$ have a linearly independent spanning set that consists of
eigenvectors of A?


12. Consider the matrix

3 -1
[2]


(a) Determine the eigenvalues and eigenvectors of A, and show that A has
two linearly independent eigenvectors.

(b) Let P be the matrix whose columns are two linearly independent
eigenvectors of A. Why is P invertible?

(c) Let D be the diagonal matrix whose diagonal entries are the
eigenvalues of A; place the eigenvalues on the diagonal in an order
corresponding to the order of the eigenvectors in the columns of P,
where P is the matrix defined in (b) above. Compute AP and PD.
What do you observe?

(d) Explain why $A = PDP^{-1}$. Use this factorization to compute $A^2$, $A^3$, and
$A^{10}$ in terms of P, D, and $P^{-1}$. In particular, explain how $A^{10}$ can be
easily computed by using the diagonal matrix D along with P and $P^{-1}$.


13. Consider the matrix


\[ A = \begin{bmatrix} 3 & -1 & 1 \\ -1 & 3 & -1 \\ 1 & -1 & 3 \end{bmatrix} \]


(a) Determine the eigenvalues and eigenvectors of A, and show that A has
three linearly independent eigenvectors.

(b) Let P be the matrix whose columns are three linearly independent
eigenvectors of A. Why is P invertible?

(c) Let D be the diagonal matrix whose diagonal entries are the
eigenvalues of A; place the eigenvalues on the diagonal in an order
corresponding to the order of the eigenvectors in the columns of P,
where P is the matrix defined in (b) above. Compute AP and PD.
What do you observe?

(d) Explain why $A = PDP^{-1}$. Use this factorization to compute $A^2$, $A^3$,
and $A^{10}$ in terms of P, D, and $P^{-1}$.


14. Prove that an n × n matrix A is invertible if and only if A has no eigenvalue
equal to zero.


15. Show that if A, B, and P are square matrices (with P invertible) such that
$B = PAP^{-1}$, then A and B have the same eigenvalues. (Hint: consider the
characteristic equation for $PAP^{-1}$.)


16. Prove that if A is a 2 × 2 matrix and v and u are eigenvectors of A
corresponding to distinct eigenvalues $\lambda$ and $\mu$, then v and u are linearly
independent. (Hint: suppose to the contrary that v and u are linearly
dependent.)


17. For a differentiable function y, denote the derivative of y with respect to x
by D(y). Now consider the function y = e”*, and compute D(y). For what 
value of $\lambda$ is $D(y) = \lambda y$? Explain how this value behaves like an eigenvalue
of the operator D. What is the corresponding eigenvector? How does the
problem change if we consider $y = e^{rx}$ for any other real value of r?


18. For a vector-valued function x(t), let the derivative of x with respect to t
be denoted by D(x). For the function


\[ \mathbf{x}(t) = \begin{bmatrix} e^{2t} \\ -3e^{2t} \end{bmatrix} \]


compute D(x). For what value(s) of $\lambda$ is $D(\mathbf{x}) = \lambda\mathbf{x}$? Explain how it
appears from your work that the operator D has an eigenvalue-eigenvector
pair.


19. Suppose that for a large population that stays relatively constant, people
are classified as living in urban, suburban, or rural settings. Moreover,
assume that the probabilities of the various possible transitions are given
by the following table:


Future location (↓) / current location (→) | U (%) | S (%) | R (%)
Urban                                      |  90   |   3   |   2
Suburban                                   |   7   |  96   |  10
Rural                                      |   3   |   1   |  88


Given that a population of 250 million is present, is there a stationary
vector that reveals a population which does not change from year to year?


20. Car-owners can be grouped into classes based on the vehicles they own.
A study of owners of sedans, minivans, and sport utility vehicles shows
that the likelihood that an owner of one of these automobiles will replace it
with another of the same or different type is given by the table


Future vehicle (↓) / current vehicle (→) | Sedan (%) | Minivan (%) | SUV (%)
Sedan                                    |    91     |      3      |    2
Minivan                                  |     7     |     95      |    8
SUV                                      |     2     |      2      |   90


If there are currently 100 000 vehicles in the population under study, is
there a stationary vector that represents a distribution in which the
number of owners of each type of vehicle will not change as they replace
their vehicles?


21. Decide whether each of the following sentences is true or false. In every 
case, write one sentence to support your answer. 


(a) If x is any vector and $\lambda$ is a constant such that $A\mathbf{x} = \lambda\mathbf{x}$, then x is an
eigenvector of A.

(b) If Ax = 0 has nontrivial solutions, then $\lambda = 0$ is an eigenvalue of A.

(c) Every 3 × 3 matrix has three real eigenvalues.

(d) If A is a 2 × 2 matrix, then A can have up to two real linearly
independent eigenvectors.


1.11 Generalized vectors 


Throughout our work with vectors in $\mathbb{R}^n$, we have regularly used several key
algebraic properties they possess. For example, any two vectors u and v can be
added to form a new vector u + v, any single vector u can be multiplied by a scalar
c to determine a new vector cu, and there is a zero vector 0 with the property
that for any vector v, v + 0 = v. Of course, we use other algebraic properties of
vectors as well, often implicitly.

Other sets of mathematical objects behave in ways that are algebraically 
similar to vectors. The purpose of this section is to expand our perspective 
on what familiar mathematical entities might also reasonably be called vectors; 
much of this expanded perspective is in anticipation of our pending work with 
differential equations and their solutions. We motivate our study with several 
familiar examples, and then summarize a collection of formal properties that all 
these examples share. 


Example 1.11.1 Let $M_{2\times 2}$ denote the collection of all 2 × 2 matrices with real
entries. Show that if A and B are any 2 × 2 matrices and $c \in \mathbb{R}$, then A + B and
cA are also 2 × 2 matrices. In addition, show that there exists a “zero matrix” Z
such that A + Z = A for every matrix A.

Solution. Let

\[ A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} \]

By the definition of matrix addition,


\[ A + B = \begin{bmatrix} a_{11} + b_{11} & a_{12} + b_{12} \\ a_{21} + b_{21} & a_{22} + b_{22} \end{bmatrix} \]


and thus we see that A + B is also a 2 × 2 matrix. Recall that it only makes sense
for matrices of the same size to be added; here we are simply pointing out the
obvious fact that the sum of two matrices of the same size is yet another matrix
of the same size. In the same way,


\[ cA = \begin{bmatrix} ca_{11} & ca_{12} \\ ca_{21} & ca_{22} \end{bmatrix} \]

which shows that not only is the scalar multiple defined, but also that cA is a
2 × 2 matrix. Finally, if we let Z be the 2 × 2 matrix all of whose entries are zero,


\[ Z = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \]


then our work with matrix sums shows us immediately that A + Z = A for every
possible 2 × 2 matrix A.


Certainly, we can see that there is nothing particularly special about the 2 × 2
case in this example; the same properties will hold for $M_{m\times n}$ for any positive
integer values of m and n.

Mathematicians often use the language “$M_{2\times 2}$ is closed under addition
and scalar multiplication” and “$M_{2\times 2}$ contains a zero element” to describe
the observations we made in example 1.11.1. Specifically, to say that a set is
closed under an operation means simply that if we perform the operation on 
an appropriate number of elements from the set, the result is another element 
in the set. We next consider several more examples of sets that demonstrate the 
properties of being closed and having a zero element. 


Example 1.11.2 Let $P_2$ denote the set of all polynomials of degree 2 or less.
That is, $P_2$ is the set of all functions of the form


\[ p(x) = a_2 x^2 + a_1 x + a_0 \]


where $a_0, a_1, a_2 \in \mathbb{R}$. Show that $P_2$ is closed under addition and scalar
multiplication, and that $P_2$ contains a zero element.


Solution. Before we formally address the stated tasks, let us remind ourselves
how we add polynomial functions. If we are given, say, $f(x) = 2x^2 - 5x + 11$ and
$g(x) = 4x - 3$, we compute $(f + g)(x) = f(x) + g(x) = 2x^2 - 5x + 11 + 4x - 3$.
We can then add like terms to simplify and find that $(f + g)(x) = 2x^2 - x + 8$.
Similarly, if we wanted to compute $(-3f)(x)$, we have $(-3f)(x) = -3f(x) =
-3(2x^2 - 5x + 11) = -6x^2 + 15x - 33$.

We now show that $P_2$ is indeed closed under the operations of addition
and scalar multiplication. Given two arbitrary elements of $P_2$, say $f(x) = a_2 x^2 +
a_1 x + a_0$ and $g(x) = b_2 x^2 + b_1 x + b_0$, it follows upon adding and combining
like terms that


\[ (f + g)(x) = (a_2 + b_2) x^2 + (a_1 + b_1) x + (a_0 + b_0) \]


which is obviously a polynomial of degree 2 or lower, and thus f + g is an
element of $P_2$. In the same way, for any real value c,


\[ (cf)(x) = c a_2 x^2 + c a_1 x + c a_0 \]


which also belongs to $P_2$. Finally, it is evident that if we let $z(x) = 0x^2 + 0x + 0$
(i.e., z(x) is the zero function), then $(f + z)(x) = f(x)$ for any choice of f in $P_2$.


Here, too, we should observe that while these properties hold for $P_2$, there is
nothing special about the 2. In fact, $P_n$ (the set of all polynomials of degree n or
less) has the exact same properties. Even P, the set of all polynomials, behaves
in the same manner.
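
The arithmetic in example 1.11.2 becomes mechanical if each polynomial in $P_2$ is stored by its three coefficients. The sketch below is one way to do this; the tuple representation and the function names `add` and `scale` are our own choices for illustration. Since the sum and the scalar multiple are again coefficient triples, the results are again elements of $P_2$.

```python
# Represent p(x) = a2*x^2 + a1*x + a0 in P2 by the tuple (a2, a1, a0).

def add(p, q):
    """Add two polynomials coefficient by coefficient."""
    return tuple(a + b for a, b in zip(p, q))

def scale(c, p):
    """Multiply every coefficient of p by the scalar c."""
    return tuple(c * a for a in p)

f = (2, -5, 11)    # f(x) = 2x^2 - 5x + 11
g = (0, 4, -3)     # g(x) = 4x - 3
zero = (0, 0, 0)   # the zero polynomial

print(add(f, g))          # (2, -1, 8), that is, 2x^2 - x + 8
print(scale(-3, f))       # (-6, 15, -33), that is, -6x^2 + 15x - 33
print(add(f, zero) == f)  # True
```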


Example 1.11.3 From calculus, consider the set C[-1, 1] of all continuous
functions on the interval [-1, 1]. That is,


\[ C[-1, 1] = \{ f \mid f \text{ is continuous on } [-1, 1] \}. \]


Show that C[-1, 1] is closed under addition and scalar multiplication, and also
that C[-1, 1] contains a zero element.


Solution. Two standard facts from calculus tell us that the sum of any two
continuous functions is also a continuous function and that a constant multiple
of a continuous function is also a continuous function. Thus C[-1, 1] is
closed under addition and scalar multiplication. Furthermore, the zero function
z(x) = 0 is itself continuous, which shows that C[-1, 1] indeed has a zero element.


One of the principal reasons that we are shifting our attention from vectors in $\mathbb{R}^n$
to this more generalized concept of vector where the objects under consideration
are often functions is the fact that our focus in subsequent chapters will be solving
differential equations. The solution to a differential equation is a function that
makes the equation true. Moreover, we will also see that for certain important
classes of differential equations, there are multiple solutions to the equation and
that often these solution sets are closed under addition and scalar multiplication
and also contain the zero function.

From each of the above examples, we see that $\mathbb{R}^n$ has many important
properties that we can consider in a broader context. We therefore introduce
the notion of a vector space, which is a set of objects that have defined operations
of addition and scalar multiplication that satisfy the list of ten rules below. The
concept of a vector space is a generalization of $\mathbb{R}^n$.
While many of the rules are technical in nature, the most important ones
to verify turn out to be the three that we have focused on so far: being closed
under addition, closed under scalar multiplication, and having a zero element.
All three sets described in the above examples are vector spaces, as is $\mathbb{R}^n$.


Definition 1.11.1 A vector space is a nonempty set V of objects, on which
operations of addition and scalar multiplication are defined, where the objects
in V (called vectors) adhere to the following ten rules:


1. For every u and v in V, the sum u + v is in V (V is “closed under vector
addition”)


2. For every u and v in V, u + v = v + u (“vector addition is commutative”)


3. For every u, v, w in V, (u + v) + w = u + (v + w) (“vector addition is
associative”)


4. There exists a zero vector 0 in V such that u + 0 = u for every $\mathbf{u} \in V$ (0 is
called the additive identity of V)


5. For every $\mathbf{u} \in V$, there is a vector -u such that u + (-u) = 0 (-u is called
the additive inverse of u)


6. For every $\mathbf{u} \in V$ and every scalar c, the scalar multiple $c\mathbf{u} \in V$ (V is
“closed under scalar multiplication”)


7. For every u and v in V and every scalar c, c(u + v) = cu + cv (“scalar
multiplication is distributive over vector addition”)


8. For every $\mathbf{u} \in V$ and scalars c and d, (c + d)u = cu + du


9. For every $\mathbf{u} \in V$ and scalars c and d, c(du) = (cd)u


10. For every $\mathbf{u} \in V$, 1u = u


Sometimes we can take a sub-collection (i.e., a subset) of the vectors in a 
vector space, and that smaller set itself acts like a vector space. For example, the 
set of all polynomial functions is a vector space. If we take just the polynomials 
of degree 2 or less (as in example 1.11.2 above), that subset is itself a vector 
space. This leads us to introduce the notion of a subspace. 


Definition 1.11.2 Given a vector space V, let H be a subset of V (i.e.,
every object in H is also in V). There are then operations of addition and
scalar multiplication on objects in H: specifically, the same addition and scalar
multiplication as on the objects in V. We say H is a subspace of V if and only if
all three of the following conditions hold:


1. H is closed under addition

2. H is closed under scalar multiplication

3. H contains the zero element of V


We close this section with two important examples of subspaces. The first
is a subspace of $\mathbb{R}^n$ associated with a given matrix A. The second is a subspace
of the set of all continuous functions on [-1, 1].


Example 1.11.4 Recall the matrix A from example 1.10.4 in section 1.10,


\[ A = \begin{bmatrix} 5 & 6 & 2 \\ 0 & -1 & -8 \\ 1 & 0 & -2 \end{bmatrix} \]


Show that the set of all eigenvectors that correspond to a given eigenvalue of A
forms a subspace of $\mathbb{R}^3$.


Solution. In example 1.10.4, we saw that the eigenvalues of A are $\lambda = -4$ (with
multiplicity 1) and $\lambda = 3$ (with multiplicity 2). In addition, the corresponding
eigenvectors are $\mathbf{v}_{\lambda=-4} = [-2 \;\; \tfrac{8}{3} \;\; 1]^T$ for $\lambda = -4$ and $\mathbf{v}_{\lambda=3} = [5 \;\; -2 \;\; 1]^T$ for $\lambda = 3$. In
particular, recall that every scalar multiple of $\mathbf{v}_{\lambda=-4}$ is also an eigenvector of A
corresponding to $\lambda = -4$. We now show that the set of all these eigenvectors
corresponding to $\lambda = -4$ is a subspace of $\mathbb{R}^3$.

Let $E_{\lambda=-4}$ denote the set of all vectors v such that $A\mathbf{v} = -4\mathbf{v}$. First, certainly
it is the case that $A\mathbf{0} = -4\mathbf{0}$. This shows that the zero element of $\mathbb{R}^3$ is an
element of $E_{\lambda=-4}$. Furthermore, we have already seen that every scalar multiple
of an eigenvector is itself an eigenvector, and thus $E_{\lambda=-4}$ is closed under scalar
multiplication. Finally, suppose we have two vectors x and y such that $A\mathbf{x} = -4\mathbf{x}$
and $A\mathbf{y} = -4\mathbf{y}$. Observe that by properties of linearity,


\begin{align*}
A(\mathbf{x} + \mathbf{y}) &= A\mathbf{x} + A\mathbf{y} \\
                           &= -4\mathbf{x} - 4\mathbf{y} \\
                           &= -4(\mathbf{x} + \mathbf{y})
\end{align*}
which shows that (x + y) is also an eigenvector of A corresponding to $\lambda = -4$.
Therefore, $E_{\lambda=-4}$ is closed under addition.


This shows that $E_{\lambda=-4}$ is indeed a subspace of $\mathbb{R}^3$. In a similar fashion, $E_{\lambda=3}$
is also a subspace of $\mathbb{R}^3$.


Our observations for the eigenspaces of the 3 × 3 matrix A in example 1.11.4
hold in general for any n × n matrix A: the set of all eigenvectors corresponding
to a given eigenvalue of A forms a subspace of $\mathbb{R}^n$.
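
The general statement follows from the same three checks used in the example, with -4 replaced by an arbitrary eigenvalue. The following is a brief sketch, where the notation $E_\lambda$ for the set of all vectors x satisfying $A\mathbf{x} = \lambda\mathbf{x}$ is our own shorthand, x and y are any vectors in $E_\lambda$, and c is any scalar:

```latex
\begin{align*}
A\mathbf{0} &= \mathbf{0} = \lambda\mathbf{0}
  && \text{(the zero vector lies in } E_\lambda\text{)} \\
A(c\mathbf{x}) &= c(A\mathbf{x}) = c(\lambda\mathbf{x}) = \lambda(c\mathbf{x})
  && \text{(closed under scalar multiplication)} \\
A(\mathbf{x} + \mathbf{y}) &= A\mathbf{x} + A\mathbf{y}
  = \lambda\mathbf{x} + \lambda\mathbf{y} = \lambda(\mathbf{x} + \mathbf{y})
  && \text{(closed under addition)}
\end{align*}
```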


Example 1.11.5 Show that the set of all linear combinations of the sine and
cosine functions is a subspace of the vector space C of all continuous functions.


Solution. We let C denote the vector space of all continuous functions, and
now let H be the subset of C which is defined to be all functions that are linear
combinations of sin t and cos t. That is, a typical element of H is a function f of
the form

\[ f(t) = c_1 \sin t + c_2 \cos t \]
where $c_1$ and $c_2$ are any real scalars. We need to show that the set H contains the
zero function from C, that H is closed under scalar multiplication, and that H
is closed under addition.

First, if we choose $c_1 = c_2 = 0$, the function $z(t) = 0 \sin t + 0 \cos t = 0$ is the
function that is identically zero, which is the (continuous) zero function from C.
Next, if we take a function from H, say $f(t) = c_1 \sin t + c_2 \cos t$, and multiply it
by a scalar k, we get


\[ kf(t) = k(c_1 \sin t + c_2 \cos t) = (kc_1) \sin t + (kc_2) \cos t \]


which is of course another element in H, so H is closed under scalar
multiplication. Finally, if we consider two elements f and g in H, given by
$f(t) = c_1 \sin t + c_2 \cos t$ and $g(t) = d_1 \sin t + d_2 \cos t$, then it follows that


\begin{align*}
f(t) + g(t) &= (c_1 \sin t + c_2 \cos t) + (d_1 \sin t + d_2 \cos t) \\
            &= (c_1 + d_1) \sin t + (c_2 + d_2) \cos t
\end{align*}


so that H is closed under addition, too. Thus, H is a subspace of C.


In fact, it turns out that the subspace considered in example 1.11.5 contains all 
of the solutions to a familiar differential equation. We will revisit this issue in 
example 1.11.7. It is also instructive to consider an example of a set that is not a 
subspace. 


Example 1.11.6 Consider the vector space C[-1, 1] of all continuous functions
on the interval [-1, 1]. Let H be the set of all functions f with the property that
f(-1) = f(1) = 2. Determine whether or not H is a subspace of C[-1, 1].


Solution. The set H fails all three of the required properties
of subspaces, and the failure of any one of them suffices to show that H is not a
subspace. In particular, the zero function z(t) = 0 does not have the property that z(-1) = 2,
and thus the zero function from C[-1, 1] does not lie in H, so H is not a subspace.
We could also observe that multiplying a function whose value at
t = -1 and t = 1 is 2 by any scalar other than 1 results in a new function whose value
at these points is not 2; similarly, the sum of two functions whose values at t = -1 and
t = 1 are 2 is a new function whose values at these points are 4. These facts
show that H is closed neither under scalar multiplication nor under addition.
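
A concrete pair of functions makes these failures visible. In the sketch below, the functions f(t) = 2t² and g(t) = t² + 1 are our own examples of elements of H, and the helpers `add` and `scale` are hypothetical names for pointwise addition and scalar multiplication.

```python
# Two sample members of H: continuous functions with value 2 at t = -1 and t = 1.
def f(t):
    return 2 * t * t      # f(-1) = f(1) = 2

def g(t):
    return t * t + 1      # g(-1) = g(1) = 2

def add(p, q):
    """Pointwise sum of two functions."""
    return lambda t: p(t) + q(t)

def scale(c, p):
    """Pointwise scalar multiple of a function."""
    return lambda t: c * p(t)

def zero(t):
    """The zero function of C[-1, 1]."""
    return 0

h = add(f, g)       # sum of two elements of H
k = scale(3, f)     # scalar multiple of an element of H

print(h(-1), h(1))        # 4 4, so f + g is not in H
print(k(-1), k(1))        # 6 6, so 3f is not in H
print(zero(-1), zero(1))  # 0 0, so the zero function is not in H
```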


As we have already mentioned, we are considering this generalization of the 
term vector to include mathematical objects like functions because this structure 
underlies the study of differential equations, and this vector space perspective 
will help us to better understand a variety of key ideas when we are solving 
important problems later on. To foreshadow these coming ideas, we present 
an example of an elementary differential equation that shows how the set 
of solutions to the equation is in fact the subspace of continuous functions 
considered in example 1.11.5.

Example 1.11.7 Consider the differential equation

\[ y'' + y = 0 \]


Show that $y_1 = \sin t$ and $y_2 = \cos t$ are solutions to this differential equation,
and that every function of the form $y = c_1 y_1 + c_2 y_2$ is a solution as well.


Solution. This example is very similar to example 1.6.4. Because of its
importance, we discuss the current problem in full detail here as well.

For any equation, a solution is an object that makes the equation true. In
the above differential equation, y represents a function. The equation asks “for
which functions y is the sum of y and its second derivative equal to zero?”

Observe first that if we let $y_1 = \sin t$, then $y_1' = \cos t$, so $y_1'' = -\sin t$, and
therefore $y_1'' + y_1 = -\sin t + \sin t = 0$. In other words, $y_1$ is a solution to the
differential equation. Similarly, for $y_2 = \cos t$, $y_2' = -\sin t$ and $y_2'' = -\cos t$, so
that $y_2'' + y_2 = -\cos t + \cos t = 0$. Thus, $y_2$ is also a solution to the differential
equation.

Now, consider any function y of the form $y = c_1 y_1 + c_2 y_2$. That is, let y be any
linear combination of the two solutions we have already found. We then have


\[ y = c_1 \sin t + c_2 \cos t \]


so that, using standard properties of the derivative (properties which are linear
in nature), it follows that


\[ y' = c_1 \cos t - c_2 \sin t \]


and


\[ y'' = -c_1 \sin t - c_2 \cos t \]


We therefore see that
\begin{align*}
y'' + y &= (-c_1 \sin t - c_2 \cos t) + (c_1 \sin t + c_2 \cos t) \\
        &= -c_1 \sin t + c_1 \sin t - c_2 \cos t + c_2 \cos t \\
        &= 0
\end{align*}


so that y is indeed also a solution of $y'' + y = 0$.
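
The same conclusion can be spot-checked numerically. The sketch below approximates the second derivative with a central difference and confirms that y'' + y is zero up to approximation error at a few sample points. The constants c1 and c2, the sample points, and the step size h are arbitrary choices of ours, and a numerical check of this kind illustrates the result but does not replace the derivation above.

```python
import math

def y(t, c1, c2):
    """A linear combination of the solutions sin t and cos t."""
    return c1 * math.sin(t) + c2 * math.cos(t)

def second_derivative(func, t, h=1e-4):
    """Central-difference approximation to func''(t)."""
    return (func(t + h) - 2 * func(t) + func(t - h)) / (h * h)

c1, c2 = 2.5, -1.0   # arbitrary constants chosen for illustration

residuals = []
for t in [0.0, 0.7, 1.5, 3.0]:
    ypp = second_derivative(lambda s: y(s, c1, c2), t)
    residuals.append(abs(ypp + y(t, c1, c2)))   # |y'' + y| at this point

print(max(residuals))   # small, limited only by the finite-difference error
```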


In example 1.11.7, we find a large number of connections to our work in 
systems of linear equations and linear algebra: properties of linearity, linear 
combinations of vectors, homogeneous equations, infinitely many solutions, 
and more. In particular, the set of all solutions to the differential equation in 
example 1.11.7 is precisely the subspace of continuous functions examined in 
example 1.11.5. Certainly, we will revisit these topics in greater detail as we 
progress in our study of differential equations. 


Exercises 1.11 In exercises 1-16, determine whether or not the set H is a
subspace of the given vector space V. If H is a subspace, show that it satisfies the
three required properties stipulated by the definition; if not, give at least one
example showing that at least one of the properties does not hold.


i, v=rH={[*]:s>0y>0] 
2. v=RH={|7] x.y =o} 


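3. $V = \mathbb{R}^3$, $H = \left\{ t \begin{bmatrix} 2 \\ 0 \\ -1 \end{bmatrix} : t \in \mathbb{R} \right\}$

4. $V = \mathbb{R}^3$, $H = \left\{ t \begin{bmatrix} 2 \\ 0 \\ -1 \end{bmatrix} + \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} : t \in \mathbb{R} \right\}$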

_ 2-1 Jl & 
7.V=R‘,H= “fo Ax = b where A= E 6 3 and b= 15 
8. V=R’*,H= | 


2 -l 0 
x: Ax =b where A= E | ndb=[ 9] 


9. $V = M_{2\times 2}$, $H = \{ A \in M_{2\times 2} : A \text{ is invertible} \}$
10. $V = M_{2\times 2}$, $H = \{ A \in M_{2\times 2} : A \text{ is not invertible} \}$


11. V= Mana H= {Ae Masai A=|f "|| 


c 


12. V= Masa H= {Ae Masai A=|f ‘|| 


Cc 
13. $V = C[-1, 1]$, $H = \{ f \in C[-1, 1] : f(-1) = 0 \}$
14. $V = C[-1, 1]$, $H = \{ f \in C[-1, 1] : f(-1) = 5 \}$
15. $V = C[-1, 1]$, $H = \{ f \in C[-1, 1] : f' + f = 0 \}$
16. $V = C[-1, 1]$, $H = \{ f \in C[-1, 1] : f' + f = 1 \}$


17. Recall that for a given eigenvalue $\lambda$ of a matrix A, the eigenspace associated
to that eigenvalue is the set of all eigenvectors that correspond to $\lambda$. For the


matrix A= E - , describe all of the eigenspaces of A. 


18. For the matrix A = E ,} describe all of the eigenspaces of A. 


19. Explain why for any set of vectors {u, v} in $\mathbb{R}^n$, Span{u, v} is a subspace of
$\mathbb{R}^n$. Similarly, explain why $\mathrm{Span}\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$ is a subspace of $\mathbb{R}^n$ for any set
$\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$.


20. Let $V = \mathbb{R}^3$ and

\[ H = \left\{ \begin{bmatrix} 2a + b \\ a - b \\ 3a + 5b \end{bmatrix} : a, b \in \mathbb{R} \right\}. \]

Determine vectors u and v so that H can be expressed as the set Span{u, v},
and hence explain why H is a subspace of $\mathbb{R}^3$.


21. Let $V = \mathbb{R}^3$ and

\[ H = \left\{ \begin{bmatrix} 2a + b \\ -2 \\ 3a + 5b \end{bmatrix} : a, b \in \mathbb{R} \right\}. \]

Explain why H is not a subspace of $\mathbb{R}^3$.


22. Let A be an m × n matrix. The null space of the matrix A, denoted Nul(A),
is the set of all solutions to the equation Ax = 0. Explain why Nul(A) is a
subspace of $\mathbb{R}^n$.


23. Let A be an m × n matrix. The column space of the matrix A, denoted
Col(A), is the set of all linear combinations of the columns of A. Explain
why Col(A) is a subspace of $\mathbb{R}^m$.


In exercises 24-27, use the definitions of the null space Nul(A) and column
space Col(A) of a matrix given in exercises 22 and 23.


24. Let $A = \begin{bmatrix} 2 & 1 & -1 \\ 1 & 3 & 4 \end{bmatrix}$. Is the vector $\mathbf{v} = [-2 \;\; 1 \;\; 1]^T$ in Nul(A)? Justify your
answer clearly. In addition, describe all vectors that belong to Nul(A) as
the span of a finite set of vectors.


25. Let $A = \begin{bmatrix} 1 & -2 \\ 3 & 1 \\ -4 & 0 \end{bmatrix}$. Is the vector $\mathbf{v} = [-2 \;\; 1 \;\; 1]^T$ in Col(A)? Justify your
answer. Is the vector $\mathbf{u} = [-1 \;\; 4 \;\; -4]^T$ in Col(A)? In addition, describe all
vectors that belong to Col(A) as the span of a finite set of vectors.


26. Given a matrix A and a vector v, is it easier to determine whether v lies in
Nul(A) or Col(A)? Why?


27. Given a matrix A and a vector v, is it easier to describe Nul(A) or Col(A)
as the span of a finite set of vectors? Why?


28. Consider the differential equation $y' = 3y$. Explain why any function of
the form $y = Ce^{3t}$ is a solution to this equation. Is the set of all these
solutions a subspace of the vector space of continuous functions?


29. Consider the differential equation $y' = 3y - 3$. Explain why any function
of the form $y = Ce^{3t} + 1$ is a solution to this equation. Is the set of all these
solutions a subspace of the vector space of continuous functions?


30. Decide whether each of the following sentences is true or false. In every
case, write one sentence to support your answer.


(a) If H is a subspace of a vector space V, then H is itself a vector space. 

(b) If H is a subset of a vector space V, then H is a subspace of V. 

(c) The set of all linear combinations of any two vectors in R? is a 
subspace of R?. 

(d) Every nontrivial subspace of a vector space has infinitely many 
elements. 


1.12 Bases and dimension in vector spaces 


In section 1.11, we saw that some common sets we encounter in mathematics are
very similar to $\mathbb{R}^n$. For instance, the set $M_{2\times 2}$ of all 2 × 2 matrices, the set $P_2$ of all
polynomials of degree 2 or less, and the set C[-1, 1] of all continuous functions
on [-1, 1] are sets that contain a zero element, are closed under addition, and
are closed under scalar multiplication. In addition, because they each satisfy the
other seven required characteristics we noted, these sets are all vector spaces. We
specifically observe that this enables us to take linear combinations of elements
of a vector space, because addition and scalar multiplication are defined and
closed in these collections of objects.

Every vector space has further characteristics that are similar to $\mathbb{R}^n$.
For example, it is natural to discuss now-familiar concepts such as linear
independence and span in the context of the more generalized notion of vector.
As we will see, the definitions of these terms in the setting of vector spaces are
almost identical to those we encountered earlier in $\mathbb{R}^n$. Moreover, just as we can
frequently describe sets in $\mathbb{R}^n$ in terms of a small number of special vectors, we
will find that this often occurs in general vector spaces.

We begin by updating two key definitions. 


Definition 1.12.1 In a vector space V, given a set $S = \{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$ where each
vector $\mathbf{v}_i \in V$, the set S is linearly dependent if there exists a nontrivial solution
to the vector equation


$$x_1 \mathbf{v}_1 + x_2 \mathbf{v}_2 + \cdots + x_k \mathbf{v}_k = \mathbf{0} \tag{1.12.1}$$


If (1.12.1) has only the trivial solution ($x_1 = \cdots = x_k = 0$), then we say the set S
is linearly independent.


The only difference between this definition and definition 1.6.1 that we
encountered in section 1.6 is that $\mathbb{R}^n$ has been replaced by V. Just as with
vectors in $\mathbb{R}^n$, it is an equivalent formulation to say that a set S in a vector space
V is linearly independent if and only if no vector in the set may be written as a
linear combination of the other vectors in the set.

We can also define the span of a set of vectors in a vector space V. 
Definition 1.12.2 In a vector space V, given a set of vectors $S = \{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$,
$\mathbf{v}_i \in V$, the span of S, denoted Span(S) or $\text{Span}\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$, is the set of all linear
combinations of the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_k$. Equivalently, Span(S) is the set of all
vectors $\mathbf{y}$ of the form


$$\mathbf{y} = c_1 \mathbf{v}_1 + \cdots + c_k \mathbf{v}_k,$$


where $c_1, \ldots, c_k$ are scalars. We also say that Span(S) is the subset of V spanned
by the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_k$.


In example 1.6.3 in section 1.6, we studied three sets R, S, and T in $\mathbb{R}^3$. R
contained two vectors and was linearly independent but did not span $\mathbb{R}^3$; S
contained three vectors, was linearly independent, and spanned $\mathbb{R}^3$; and T
consisted of four vectors, was linearly dependent, and spanned $\mathbb{R}^3$. In that
setting, we came to see that the set S was in some ways the best of the three:
it had both key properties of being linearly independent and a spanning set. In
other words, the set had enough vectors to span $\mathbb{R}^3$, but not so many vectors as
to generate redundancy by being linearly dependent.

Through the next definition, we will now call such a set a basis, even in the 
generalized setting of vector spaces and subspaces. 


Definition 1.12.3. Let V be a vector space and H a subspace of V. A set $B =
\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$ of vectors in H is called a basis of H if and only if B is linearly
independent and Span(B) = H. That is, B is a basis of H if and only if it is a
linearly independent spanning set.


Several examples now follow that use the terminology of linear independence,
span, and basis in the context of different vector spaces.


Example 1.12.1 In the vector space P of all polynomials, consider the subspace
$H = P_2$ of all polynomials of degree 2 or less. Show that the set $B = \{1, t, t^2\}$ is
a basis for H. Is the set $\{1, t, t^2, 4 - 3t\}$ also a basis for H?


Solution. To begin, we observe that every element of $H = P_2$ is a polynomial
function of the form $p(t) = a_0 + a_1 t + a_2 t^2$. In particular, every element of
$P_2$ is a linear combination of the functions 1, t, and $t^2$, and therefore the set
$B = \{1, t, t^2\}$ spans H.
In addition, to determine whether the set B is linearly independent, we
consider the equation
$$c_0 + c_1 t + c_2 t^2 = 0 \tag{1.12.2}$$
and ask whether or not this equation has a nontrivial solution. Keeping in
mind that the '0' on the right-hand side represents the zero function in $P_2$, the
function that is everywhere equal to zero, we can see that if at least one of $c_0$,
$c_1$, or $c_2$ is nonzero, we will be guaranteed to have either a nonzero constant
function, a linear function, or a quadratic function, thus making $c_0 + c_1 t + c_2 t^2$
not identically zero. This shows that (1.12.2) has only the trivial solution, and
therefore the set $B = \{1, t, t^2\}$ is linearly independent. Having shown that B is
a linearly independent spanning set for $H = P_2$, we can conclude that B is a
basis for H.

On the other hand, the set $\{1, t, t^2, 4 - 3t\}$ is not a basis for H since we can
observe that the element $4 - 3t$ is a linear combination of the elements 1 and t:
$4 - 3t = 4 \cdot 1 - 3 \cdot t$. This shows that the set $\{1, t, t^2, 4 - 3t\}$ is linearly dependent
and thus cannot be a basis.
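
The dependence relation in example 1.12.1 can also be checked by machine. The following is a minimal sketch in Python, not part of the original text; it assumes we store a polynomial $a_0 + a_1 t + a_2 t^2$ as the coefficient list `[a0, a1, a2]`, and the helper name `combine` is purely illustrative.

```python
from fractions import Fraction

# Assumption: a polynomial a0 + a1*t + a2*t^2 in P_2 is stored as [a0, a1, a2].
def combine(weights, polys):
    """Return the coefficient list of the linear combination sum(w * p)."""
    return [sum(Fraction(w) * p[i] for w, p in zip(weights, polys))
            for i in range(3)]

one, t, t2 = [1, 0, 0], [0, 1, 0], [0, 0, 1]
q = [4, -3, 0]  # the polynomial 4 - 3t

# The nontrivial weights 4, -3, 0, -1 produce the zero polynomial, so the
# four-element set {1, t, t^2, 4 - 3t} is linearly dependent.
dependency = combine([4, -3, 0, -1], [one, t, t2, q])

# For {1, t, t^2}, the weights are returned unchanged as the coefficient
# list, so only the trivial weights can give the zero polynomial.
generic = combine([5, -2, 7], [one, t, t2])
```

Here `dependency` is the zero coefficient list, which mirrors the relation $4 \cdot 1 - 3 \cdot t - (4 - 3t) = 0$ used in the solution.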


Example 1.12.2 Consider the set H of all functions of the form $y = c_1 \sin t +
c_2 \cos t$. In the vector space C of all continuous functions, explain why the set
$B = \{\sin t, \cos t\}$ is a basis for the subspace H.


Solution. First, we recall that H is indeed a subspace of $C[-1, 1]$ due to our
work in example 1.11.5.

By the definition of H (the set of all functions of the form $y = c_1 \sin t +
c_2 \cos t$), we see immediately that B is a spanning set for H. In addition, it is
clear that the functions $\sin t$ and $\cos t$ are not scalar multiples of one another:
any scalar multiple of $\sin t$ is simply a vertical stretch of the function, which
cannot result in $\cos t$. This tells us that the set $B = \{\sin t, \cos t\}$ is also linearly
independent, and therefore is a basis for H.


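Example 1.12.3. In $\mathbb{R}^3$, consider the set $B = \{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$, where $\mathbf{e}_1 = [1\ 0\ 0]^T$,
$\mathbf{e}_2 = [0\ 1\ 0]^T$, and $\mathbf{e}_3 = [0\ 0\ 1]^T$. Explain why B is a basis for $\mathbb{R}^3$.

Is the set $S = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$, where $\mathbf{v}_1 = [1\ 2\ -1]^T$, $\mathbf{v}_2 = [-1\ 1\ 3]^T$, and
$\mathbf{v}_3 = [0\ 3\ 1]^T$ also a basis for $\mathbb{R}^3$?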


Solution. First, we observe that while the formal definition of a basis refers 
to the basis of a subspace H of a vector space V, since every vector space is a 
subspace of itself, it follows that we can also discuss a basis for a vector space. 

Considering the set $B = \{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$, we observe that the vectors in this set
are the columns of the $3 \times 3$ identity matrix. By the Invertible Matrix Theorem,
it follows that the set B is linearly independent because $I_3$ has a pivot in every
column. Likewise, the set B spans $\mathbb{R}^3$ since $I_3$ has a pivot in every row. As a
linearly independent spanning set in $\mathbb{R}^3$, B is indeed a basis.

For the set S whose elements are the columns of the matrix


$$A = \begin{bmatrix} 1 & -1 & 0 \\ 2 & 1 & 3 \\ -1 & 3 & 1 \end{bmatrix}$$


we again use the Invertible Matrix Theorem to determine whether or not S is a
basis for $\mathbb{R}^3$. Row-reducing A, it is straightforward to see that A is row equivalent
to the identity matrix, and therefore is invertible. In particular, A has a pivot in
every column and every row, and thus the columns of A are linearly independent
and span $\mathbb{R}^3$. It follows that S is also a basis for $\mathbb{R}^3$.
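
The row reduction claimed for A in example 1.12.3 is easy to confirm with exact arithmetic. The sketch below is an illustration in Python and is not taken from the text; the function name `rref` and its return convention are assumptions made for this example only.

```python
from fractions import Fraction

def rref(rows):
    """Return (reduced row echelon form, pivot column indices), using exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    r = 0  # index of the row where the next pivot will be placed
    for c in range(len(m[0])):
        # Look for a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column
        m[r], m[p] = m[p], m[r]
        pivot = m[r][c]
        m[r] = [x / pivot for x in m[r]]  # scale so the pivot equals 1
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

# Columns of A are the vectors v1, v2, v3 of example 1.12.3.
A = [[1, -1, 0],
     [2, 1, 3],
     [-1, 3, 1]]
R, pivots = rref(A)

# A square matrix with a pivot in every column has columns forming a basis.
is_basis = len(pivots) == len(A) == len(A[0])
```

Since `R` comes out as the identity matrix, the computation agrees with the conclusion that S is a basis. The same function applies to non-square matrices, where the number of pivots gives the dimension of the span of the columns.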


The basis $B = \{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$ consisting of the columns of the $3 \times 3$ identity matrix
is often referred to as the "standard basis of $\mathbb{R}^3$." In addition, by our work in
example 1.12.3, we can see the role that the Invertible Matrix Theorem plays in
determining whether a set of vectors in $\mathbb{R}^n$ is a basis or not. Specifically, since
we know that it is logically equivalent for the columns of a square matrix A to be
linearly independent and to be a spanning set for $\mathbb{R}^n$, it follows that a matrix A
is invertible if and only if its columns form a basis for $\mathbb{R}^n$. We therefore update
the Invertible Matrix Theorem with an additional statement as follows.


Theorem 1.12.1 (Invertible Matrix Theorem) Let A be an $n \times n$ matrix. The
following statements are equivalent:

a. A is invertible.

b. The columns of A are linearly independent.

c. The columns of A span $\mathbb{R}^n$.

d. A has a pivot position in every column.

e. A has a pivot position in every row.

f. A is row equivalent to $I_n$.

g. For each $\mathbf{b} \in \mathbb{R}^n$, the equation $A\mathbf{x} = \mathbf{b}$ has a unique solution.

h. $\det(A) \neq 0$.

i. The columns of A form a basis for $\mathbb{R}^n$.


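Our next example demonstrates how certain families of vectors naturally
form subspaces of $\mathbb{R}^n$ and how vector arithmetic can be used to determine a
basis for the subspace they form.


Example 1.12.4 Consider the set W of all vectors of the form

$$\begin{bmatrix} 3a + b - c \\ 4a - 5b + c \\ a + 2b - 3c \\ a - b \end{bmatrix}$$

Show that W is a subspace of $\mathbb{R}^4$ and determine a basis for this subspace.


Solution. First, we observe that a typical element $\mathbf{v}$ of W is a vector of the form

$$\mathbf{v} = \begin{bmatrix} 3a + b - c \\ 4a - 5b + c \\ a + 2b - 3c \\ a - b \end{bmatrix}$$

Using properties of vector addition and scalar multiplication, we can write

$$\mathbf{v} = a \begin{bmatrix} 3 \\ 4 \\ 1 \\ 1 \end{bmatrix} + b \begin{bmatrix} 1 \\ -5 \\ 2 \\ -1 \end{bmatrix} + c \begin{bmatrix} -1 \\ 1 \\ -3 \\ 0 \end{bmatrix}$$


From this, we observe that W may be viewed as the span of the set $S =
\{\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3\}$, where

$$\mathbf{w}_1 = \begin{bmatrix} 3 \\ 4 \\ 1 \\ 1 \end{bmatrix}, \quad \mathbf{w}_2 = \begin{bmatrix} 1 \\ -5 \\ 2 \\ -1 \end{bmatrix}, \quad \mathbf{w}_3 = \begin{bmatrix} -1 \\ 1 \\ -3 \\ 0 \end{bmatrix}$$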

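As seen in exercise 19 in section 1.11, the span of any set of vectors in $\mathbb{R}^n$
generates a subspace of $\mathbb{R}^n$; it follows that W is a subspace of $\mathbb{R}^4$. Moreover, we
can observe that $S = \{\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3\}$ is a linearly independent set since


$$\begin{bmatrix} 3 & 1 & -1 \\ 4 & -5 & 1 \\ 1 & 2 & -3 \\ 1 & -1 & 0 \end{bmatrix} \rightarrow \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}$$


Since S both spans the subspace W and is linearly independent, it follows that
S is a basis for W.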


In example 1.12.4 we used the fact that the span of any set in $\mathbb{R}^n$ is a subspace
of $\mathbb{R}^n$. This result extends to general vector spaces and is stated formally in the
following theorem. 


Theorem 1.12.2 In any vector space V, the span of any set of vectors forms a 
subspace of V. 


It is not hard to prove this result. Since the span of a set contains all linear 
combinations of the set, it must contain the zero combination and be closed 
under both vector addition and scalar multiplication. 

One of the reasons that a basis for a subspace is important is that a basis 
tells us the minimum number of vectors needed to fully describe every element 
of the subspace. More specifically, given a basis B for a subspace W, we know 
that we can write every element of W uniquely as a linear combination of the 
elements in the basis. Note that a subspace does not have a unique basis; for
example, in example 1.12.3, we saw two different bases for $\mathbb{R}^3$.

Furthermore, in $\mathbb{R}^3$ we have seen that the standard basis (and one example
of another basis) has three elements. By the Invertible Matrix Theorem, it is
clear that every basis of $\mathbb{R}^3$ consists of three vectors since we are required
to have a set that is both linearly independent and spans $\mathbb{R}^3$. Likewise, any
basis of $\mathbb{R}^n$ will have n elements. It can be shown that even in vector spaces
other than $\mathbb{R}^n$, any two bases of a subspace are guaranteed to have the same
number of elements. Therefore, this number of elements in a basis can be used
to identify a fundamental property of any subspace: the minimum number of
elements needed to describe all of the elements in the space. We call this number
the dimension of the subspace.


Definition 1.12.4 Given a subspace W in a vector space V and a basis B for
W, the number of elements in B is the dimension of W. Equivalently, if B has k
elements, we write dim(W) = k.


Thus we naturally use the language that "$\mathbb{R}^3$ is three-dimensional" and
similarly that "$\mathbb{R}^n$ has dimension n." Similarly, we can say $\dim(P_2) = 3$ (see
example 1.12.1), and that the dimension of the vector space of all linear
combinations of the functions $\sin t$ and $\cos t$ is two (see example 1.12.2).

In closing, it is worth recalling example 1.6.3 in section 1.6, where we
considered three sets R, S, and T in $\mathbb{R}^3$. R contained two vectors and was
linearly independent but did not span $\mathbb{R}^3$; S contained three vectors, was linearly
independent, and spanned $\mathbb{R}^3$; and T consisted of four vectors, was linearly
dependent, and spanned $\mathbb{R}^3$. Since the set S has both key properties of being
linearly independent and a spanning set, we can say that the set S is a basis for
$\mathbb{R}^3$, which further reflects the fact that $\dim(\mathbb{R}^3) = 3$.


Exercises 1.12 In the vector space V given in each of exercises 1-7, determine
a basis for the subspace H and hence state the dimension of H.


1.V=R°,H=t| O|]:teR 


2. $V = P_2$, $H = \{at^2 : a \in \mathbb{R}\}$


4. V = P (the vector space of all polynomials), $H = P_n$ (the subspace of all
polynomials of degree n or less)


5. v=RH=|x:ax=owherea=|_f “all 


F a 
_ mp4 Jy. = _ 
6. VRS H= |x: Ax=OwhereA=|_) 5 0 ‘|| 


7. V= Mi H= {Ae Maa: A=|i "|| 
8. Determine whether or not the following set S is a basis for $\mathbb{R}^3$. If not, is
some subset of S a basis for $\mathbb{R}^3$? Explain.


1 0 1 2


9. Is the set $S = \{[1\ 2]^T, [2\ 1]^T\}$ a basis for $\mathbb{R}^2$? Justify your answer.


10. Is the set $S = \{[1\ 2]^T, [-4\ -8]^T\}$ a basis for $\mathbb{R}^2$? Justify your answer.


11. Is the set $S = \{[1\ 2\ 1\ 1]^T, [2\ 1\ 1\ -1]^T, [-1\ 1\ 3\ 1]^T, [2\ 4\ 5\ 1]^T\}$ a basis for
$\mathbb{R}^4$? Justify your answer.


12. Is the set $S = \{[1\ 2\ 1\ 1]^T, [2\ 1\ 1\ -1]^T, [-1\ 1\ 3\ 1]^T, [2\ 4\ 5\ 0]^T\}$ a basis for
$\mathbb{R}^4$? Justify your answer.


13. Can a set with three vectors be a basis for $\mathbb{R}^4$? Why or why not?


14. Can a set with seven vectors be a basis for R°? Why or why not?


15. Not every vector space has a basis with finitely many elements. If there is
not a finite basis, then we say that the vector space is infinite dimensional.
Explain why the vector space P of all polynomial functions is an infinite
dimensional vector space.


16. Let V be the vector space $V = C[-1, 1]$ and H the subset defined by
$H = \{f \in C[-1, 1] : f \text{ is differentiable}\}$


Explain why H is an infinite dimensional subspace of V and why we
cannot explicitly write down the elements in a basis for H.


17. Recall from exercises 22 and 23 in section 1.11 that the null space of a
matrix A is the subspace of all solutions to the equation $A\mathbf{x} = \mathbf{0}$ and that
the column space of A is the space spanned by the columns of A. By
exploring several different examples of matrices A of your choice, discuss
how the dimensions of the null and column spaces are related to the
number of pivot columns in the matrix. In particular, explain what you
can say about the relationship between the sum of the dimensions of the
null and column spaces and the number of columns in the matrix A.


18. Decide whether each of the following sentences is true or false. In every
case, write one sentence to support your answer.


(a) Any set of five vectors is a basis for R°. 

(b) If S is a linearly independent set of six vectors in $\mathbb{R}^6$, then S is a basis
for $\mathbb{R}^6$.

(c) If the determinant of a $3 \times 3$ matrix A is zero, then the columns of A
form a basis for $\mathbb{R}^3$.

(d) If A is an $n \times n$ matrix whose columns span $\mathbb{R}^n$, then the columns of A
form a basis for $\mathbb{R}^n$.
1.13 For further study


1.13.1 Computer graphics: geometry and linear algebra at work


In modern computer graphics, images consisting of sets of pixels are moved 
around the screen through mathematical computations that rely on linear 
algebra. If we focus on two-dimensional objects, there are several basic moves 
that we must be able to perform: translation, rotation, reflection, and dilation. 
In what follows, we explore the role that linear algebra plays in the geometry of 
linear transformations and computer graphics. 


(a) In section 1.8.1 we began to develop an understanding of how matrix 
multiplication can be used to move a two-dimensional image around the 
plane. If you have not already read this section, do so now. 


If we take the perspective that a given point in the plane is stored in the
vector $\mathbf{v}$, then for any $2 \times 2$ matrix A, the matrix A moves the vector via
multiplication to the new location $A\mathbf{v}$. If we have a finite set of points
(which together constitute an image), we can store the points in a matrix
M (whose columns represent the individual points), and the new image
which results from multiplication by A is given by AM.


Consider the triangle with vertices (0, 0), (3, 1), and (2, 2), stored in the
matrix

$$M = \begin{bmatrix} 0 & 3 & 2 \\ 0 & 1 & 2 \end{bmatrix}$$


Choose three different matrices A and compute AM. Then explain why it 
is impossible to use multiplication by a 2 x 2 matrix to translate the 
triangle so that all three of its vertices appear in new locations. 


(b) Due to our discovery in (a) that a simple translation is impossible using
$2 \times 2$ matrices, we introduce the notion of homogeneous coordinates;
instead of representing points in the two-dimensional plane as $[x\ y]^T$, we
move to a plane in three-dimensional space where the third coordinate is
always 1. That is, instead of $[x\ y]^T$ we use $[x\ y\ 1]^T$.


Consider the matrix A given by

$$A = \begin{bmatrix} 1 & 0 & a \\ 0 & 1 & b \\ 0 & 0 & 1 \end{bmatrix} \tag{1.13.1}$$

and the triangle from (a) which can be represented in homogeneous
coordinates by the matrix

$$M = \begin{bmatrix} 0 & 3 & 2 \\ 0 & 1 & 2 \\ 1 & 1 & 1 \end{bmatrix}$$

Compute AM. What has happened to each vertex of the triangle
represented by M? Explain in terms of the parameters a and b in A.


(c) Using a = 2 and b = —1 in (1.13.1) along with the triangle M from above, 
compute AM in order to determine the translation of the triangle 2 units 
in the x-direction and —1 units in the y-direction. Sketch both the original 
triangle and its image under this translation. 


(d) In order to view some more sophisticated graphics, we use Maple in our
computations that follow. Rather than performing operations on a
triangle, we will use the syntax


> with(plots): with(LinearAlgebra):
> setoptions(scaling=constrained, axes=boxed,
tickmarks=[5,5]):
> X := cos(t)*(1+sin(t))*(1+0.3*cos(8*t))*
(1+0.1*cos(24*t)):
> Y := sin(t)*(1+sin(t))*(1+0.3*cos(8*t))*
(1+0.1*cos(24*t)):
> plot([X,Y,t=0..2*Pi], color=blue,
thickness=3);


which generates a parametric curve whose plot is the leaf shown in
figure 1.18. Input these commands in Maple, as well as the syntax


> leaf := plot([X,Y,t=0..2*Pi], color=grey,
thickness=1):


to store the image of the original leaf in leaf.


Figure 1.18 A Maple leaf.
Finally, for a given matrix A of the form

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ 0 & 0 & 1 \end{bmatrix}$$

and a vector $Z = [X\ Y\ 1]^T$, compute AZ (by hand) to show how AZ
depends on the entries in A.


(e) By our work in (c) and (d), if we now let

$$A = \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{bmatrix}$$

the product AZ should result in translation of the leaf by the vector
$[2\ -1]^T$. To test this, we define the matrix A in Maple by


> A := <<1,0,0>|<0,1,0>|<2,-1,1>>; 


and compute the coordinates in the new image by 


> Xnew := A[1,1]*X + A[1,2]*Y + A[1,3]*1:
> Ynew := A[2,1]*X + A[2,2]*Y + A[2,3]*1:
> image1 := plot([Xnew, Ynew, t=0..2*Pi],
thickness=3, color=blue):


The last command above plots the resulting image and stores it in
image1. Display both the original leaf and the new image with the
command


> display(leaf, image1);


and show that this results indeed in the translated leaf as shown in 
figure 1.19. 
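
For readers who want a numerical check that does not depend on reading a plot, the translation in (e) can be verified point by point. The following is a rough sketch in Python rather than Maple, offered only as an illustration; the helper names `mat_vec` and `leaf_point` and the choice of 201 sample values of t are assumptions, not part of the text.

```python
import math

def mat_vec(A, v):
    """Multiply the matrix A (a list of rows) by the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def leaf_point(t):
    """Homogeneous coordinates [X, Y, 1] of the leaf curve from (d) at parameter t."""
    r = (1 + math.sin(t)) * (1 + 0.3 * math.cos(8 * t)) * (1 + 0.1 * math.cos(24 * t))
    return [math.cos(t) * r, math.sin(t) * r, 1.0]

# Translation matrix from (e): a = 2, b = -1 in (1.13.1).
A = [[1, 0, 2],
     [0, 1, -1],
     [0, 0, 1]]

ts = [2 * math.pi * k / 200 for k in range(201)]
original = [leaf_point(t) for t in ts]
translated = [mat_vec(A, z) for z in original]

# Largest deviation from "shift x by 2, shift y by -1, keep third coordinate 1".
max_error = max(
    max(abs(new[0] - (old[0] + 2)), abs(new[1] - (old[1] - 1)), abs(new[2] - 1))
    for old, new in zip(original, translated)
)
```

A value of `max_error` at or near zero indicates that every sampled point moved by exactly the vector $[2\ -1]^T$, which is what the third column of A is designed to accomplish.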


(f) In section 1.8.1, we learned that a matrix of the form

$$R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

is known as a rotation matrix and, through multiplication, rotates any
vector by $\theta$ radians counterclockwise about the origin. To work with a
rotation matrix in homogeneous coordinates, we update the matrix as
follows:

$$R = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
Let us say that we wanted to perform two operations on the leaf. First, we
wish to translate the leaf as above along the vector $[2\ -1]^T$, and then we
want to rotate the resulting image $\pi/4$ radians clockwise about the origin.
We can accomplish this through two matrices by computing their
product, as the following discussion shows.


Figure 1.19 The original leaf and its translation by $[2\ -1]^T$.

From (e), we know that using the matrix


> Translation := <<1,0,0>|<0,1,0>|<2,-1,1>>;


leads to the desired translation. Likewise, the matrix


> Rotation := <<1/sqrt(2),-1/sqrt(2),
0>|<1/sqrt(2),1/sqrt(2),0>|<0,0,1>>;


will produce the sought rotation.
Explain why the matrix


> A := Rotation.Translation;


will produce the combined translation and rotation, and plot the resulting
figure by updating your computations for Xnew and Ynew and using the
syntax


> image2 := plot([Xnew, Ynew,t=0..2*Pi],
thickness=4, color=black):
> display(leaf, image1, image2);


(g) What is the result of applying the matrix 
on the leaf? What kind of geometric transformation is performed by this
matrix? What matrix would keep the height of the leaf constant but stretch 
its width by a factor of 2? 


(h) It can be shown that to reflect an image across a line through the origin
that forms an angle $\alpha$ with the positive x-axis, the necessary matrix is

$$A = \begin{bmatrix} \cos 2\alpha & \sin 2\alpha & 0 \\ \sin 2\alpha & -\cos 2\alpha & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

By finding the appropriate value of $\alpha$, find the matrix that will reflect an
image across the line y = x and compute and plot the image of the original
leaf under this reflection.


(i) Exercises for further practice and investigation: 


1. Find the image of the original leaf under rotation about the origin by
$2\pi/3$ radians, followed by a reflection across the y-axis.

2. Find the image of the original leaf under rotation about the point
(-3, 1) by $-\pi/6$ radians. (Hint: To rotate about a point other than the
origin, first translate that point to the origin, then rotate, then translate
back.)

3. Find the image of the original leaf under translation along the vector
$[3\ 2]^T$, followed by reflection across the line y = x/2.


1.13.2 Bézier curves 


In what follows^7, we explore the use of a specific type of parametric curve,
called Bézier curves (pronounced "bezzy-eh"), which have a variety of important
applications. These curves were originally developed by two automobile 
engineers in France in the 1960s, P. Bézier and P. de Casteljau, who were working 
to develop mathematical formulas to graph the smooth, wiggle-free curves that 
formed the shape of a car’s body. Today, Bézier curves find their way into our 
lives every day: they are used to create the letters that appear in typeset fonts. 
The principles that govern these curves involve fundamental mathematics from 
linear algebra and calculus. 


(a) In calculus, we study parametric curves given in the form
x = f(t), y = g(t), where f and g are each functions of the parameter t.
Another way to denote this situation is to write 


P(t) = (f(t), g(t)) 


where t belongs to some interval of real numbers. Note that P(t) is
essentially a vector; the graph of P(t) is the parametric curve traced out by
the vector over time. It will be most convenient if we simply write this as
P(t) = (x(t), y(t)) in what follows. In this problem we begin to consider
some special formulas for x(t) and y(t).


7 The material in this project has been adapted from Steven Janke's chapter "Designer Curves" in
Applications of Calculus, MAA Notes Number 29, Philip Straffin, Ed.


To parameterize the line between the points $P_0(1, 3)$ and $P_1(3, 7)$, we can
think about wanting to make x go from 1 to 3, and y go from 3 to 7.
Indeed, we want these to occur simultaneously as t goes from 0 to 1.
Consider the parameterization:


$$x = x(t) = 1 + t(3 - 1) = t \cdot 3 + (1 - t) \cdot 1$$

$$y = y(t) = 3 + t(7 - 3) = t \cdot 7 + (1 - t) \cdot 3$$

$$0 \le t \le 1$$

Observe that when t = 0, x = 1 and y = 3, and when t = 1, x = 3 and
y = 7.
Show that the curve parameterized by these two equations is indeed the
line segment between $P_0$ and $P_1$. For instance, you might use algebra to
eliminate the variable t, thereby deducing a relationship between x and y.


(b) We can think about the equations for x and y in (a) in a more compact
manner. Consider the following vector notation to replace the previous
equations:

$$P(t) = \begin{bmatrix} x(t) \\ y(t) \end{bmatrix} = t \begin{bmatrix} 3 \\ 7 \end{bmatrix} + (1 - t) \begin{bmatrix} 1 \\ 3 \end{bmatrix} \tag{1.13.2}$$


This is sometimes referred to as taking a convex combination of the points
(1, 3) and (3, 7), because t and 1 - t are both nonnegative and sum to 1.


Using the above style, write the parametric equations for the line segment
that passes between the general points $P_0(x_0, y_0)$ and $P_1(x_1, y_1)$.


(c) An even more concise notation is to simply write $P(t) = (1 - t)P_0 + tP_1$.
We will now use this notation to combine two or more of these
parameterizations for line segments in a way that constructs curves that
can be "controlled" in very interesting ways.


Consider three points, labeled $P_0$, $P_1$, and $P_2$. In the most recent form of
P(t) given above at (1.13.2), write parameterizations for the two line
segments from $P_0$ to $P_1$ and from $P_1$ to $P_2$, as pictured below. Call the first
parameterization $P^{(1)}(t)$ and the second parameterization $P^{(2)}(t)$.


In addition, determine the parameterizations $P^{(1)}(t)$ and $P^{(2)}(t)$ for the
specific set of points $P_0(2, 3)$, $P_1(4, 7)$, and $P_2(7, 1)$. Show your work, and
write each out in the expanded form where you have an expression for
x(t) and another for y(t).


(d) From the two line-segment parameterizations in (c), we will now create a
new parametric plot by taking similar combinations of $P^{(1)}(t)$ and $P^{(2)}(t)$.
Figure 1.20 The line segments from $P_0$ to $P_1$ and $P_1$ to $P_2$.

Consider the function Q(t) defined as follows:
$$Q(t) = (1 - t) \cdot P^{(1)}(t) + t \cdot P^{(2)}(t) \tag{1.13.3}$$


First, substitute in (1.13.3) your expressions for $P^{(1)}(t)$ and $P^{(2)}(t)$ from
(c) that involve the general points $P_0$, $P_1$, and $P_2$. Simplify the result as
much as possible in order to write the formula for Q in the following form:


$$Q(t) = a_0(t)P_0 + a_1(t)P_1 + a_2(t)P_2$$
where $a_0(t)$, $a_1(t)$, and $a_2(t)$ are polynomial functions of t.


Then, using the specific parameterizations for $P^{(1)}(t)$ and $P^{(2)}(t)$ for the
points $P_0(2, 3)$, $P_1(4, 7)$, and $P_2(7, 1)$, determine the parametric equations
for x(t) and y(t) that make up the function Q(t). For each of these three
parameterizations ($P^{(1)}$, $P^{(2)}$, and Q), use Maple to sketch a plot^8 and
describe the results in detail. For example, how does Q(t) look in
comparison to the two line segments? What kind of functions make up the
components x(t) and y(t) in Q?

What is true about Q(0) relative to the points $P_0$, $P_1$, and $P_2$? Q(1)? What
direction is a particle moving along Q(t) headed as t starts out away
from 0? As t gets near to 1?

(e) It turns out that we will have even more freedom and control in drawing
curves if we start with four control points, $P_0$, $P_1$, $P_2$, and $P_3$. The
development here is similar to what was done above, just using a greater
number of points.


First, parameterize the segments from $P_0$ to $P_1$ (with $P^{(1)}(t)$), $P_1$ to $P_2$
(with $P^{(2)}(t)$), and from $P_2$ to $P_3$ (with $P^{(3)}(t)$). The usual formulas apply
here; write down the basic form of each $P^{(i)}(t)$, i = 1, 2, 3, in terms of the
various points $P_j$.


8 The Maple syntax to plot a parametric curve (f(t), g(t)) on the interval [a, b] is
> plot([f(t),g(t),t=a..b]);.


Then combine, as in (d) above, the parameterizations for the first two
segments to get a new function $Q^{(1)}$; also combine the parameterizations
for the second two segments to get $Q^{(2)}$. These Q parameterizations are
written as


$$Q^{(1)}(t) = (1 - t) \cdot P^{(1)}(t) + t \cdot P^{(2)}(t)$$


$$Q^{(2)}(t) = (1 - t) \cdot P^{(2)}(t) + t \cdot P^{(3)}(t)$$


Finally, combine $Q^{(1)}$ and $Q^{(2)}$ to get a new parametric function that we
call B(t) according to the natural formula


$$B(t) = (1 - t) \cdot Q^{(1)}(t) + t \cdot Q^{(2)}(t)$$
By substituting appropriately for $Q^{(1)}(t)$ and $Q^{(2)}(t)$ and then replacing
these with the appropriate $P^{(i)}(t)$ functions, show that
$$B(t) = P_0(1 - t)^3 + 3P_1 t(1 - t)^2 + 3P_2 t^2(1 - t) + P_3 t^3.$$


B(t) is called a cubic Bézier curve.


By finding and using appropriate t values, show that the points $P_0$ and $P_3$
both lie on the curve given by B(t).
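
Before working through the algebra in (e), it can be reassuring to see numerically that the nested construction and the closed-form cubic agree. The sketch below is written in Python for illustration and is not part of the original project; the function names and the control points in `control` are hypothetical and were chosen only for this check.

```python
def lerp(P, Q, t):
    """Convex combination (1 - t) * P + t * Q of two points in the plane."""
    return tuple((1 - t) * p + t * q for p, q in zip(P, Q))

def bezier_nested(P0, P1, P2, P3, t):
    """B(t) built by repeated convex combinations, as in parts (d) and (e)."""
    Pa, Pb, Pc = lerp(P0, P1, t), lerp(P1, P2, t), lerp(P2, P3, t)
    Qa, Qb = lerp(Pa, Pb, t), lerp(Pb, Pc, t)
    return lerp(Qa, Qb, t)

def bezier_closed(P0, P1, P2, P3, t):
    """B(t) from the cubic formula with weights (1-t)^3, 3t(1-t)^2, 3t^2(1-t), t^3."""
    w = ((1 - t) ** 3, 3 * t * (1 - t) ** 2, 3 * t ** 2 * (1 - t), t ** 3)
    pts = (P0, P1, P2, P3)
    return tuple(sum(wi * p[i] for wi, p in zip(w, pts)) for i in range(2))

# Hypothetical control points, used only to compare the two formulas.
control = ((0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0))

samples = [k / 10 for k in range(11)]
max_gap = max(
    abs(a - b)
    for t in samples
    for a, b in zip(bezier_nested(*control, t), bezier_closed(*control, t))
)
```

Agreement at sample values of t is evidence for the identity, not a proof of it; the exercise still asks for the algebraic substitution.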


(f) Write the formulas for x(t) and y(t) that give the parameterizations for
the cubic Bézier curve that has the four control points $P_0(2, 2)$, $P_1(5, 10)$,
$P_2(40, 20)$, and $P_3(10, 5)$. Use Maple to plot each of the parametric curves
given by $P^{(i)}(t)$, i = 1, 2, 3; $Q^{(1)}(t)$, $Q^{(2)}(t)$, and B(t) in the same window.
Discuss how the various curves combine to form others.


(g) For the general Bézier curve with control points $P_0(x_0, y_0)$, $P_1(x_1, y_1)$,
$P_2(x_2, y_2)$, and $P_3(x_3, y_3)$, derive the equation for the tangent line to the
curve at the point $(x_0, y_0)$, and prove that the point $(x_1, y_1)$ lies on this
tangent line. (Hint: to determine the slope of the tangent line, use the
chain rule in the standard way for finding dy/dx for a parametric curve.)


(h) Laser printers and the program PostScript use Bézier curves to construct
the fonts that we use to represent letters. For example, a picture of the
letter g is shown below that reveals the control points and Bézier curves
required to accomplish this.


In Maple, use two or more Bézier curves to sketch a reasonable 
representation of the letter S. (You need not try to emulate the thickness of 
the ‘g’ that is shown above.) 


Then, use an appropriate number of Bézier curves to create an 
approximation of the lowercase letter ‘a,’ in the form shown here in 
quotes. State the control points required for the various curves. 
Figure 1.21 The letter g.


(i) Discuss the role that vectors and linear combinations play in the 
development of Bézier curves. 


1.13.3 Discrete dynamical systems 


A linear discrete dynamical system is a model that represents changes in a system
from time k to time k + 1 by the rule


$$\mathbf{x}^{(k+1)} = A\mathbf{x}^{(k)}$$


A discrete dynamical system is similar to a Markov chain, but we no longer
require that the columns of the matrix A sum to 1. A key issue in either scenario
is the long-term behavior of the quantity $\mathbf{x}^{(k)}$ being modeled. In what follows, we
explore the role of eigenvalues and eigenvectors in determining this long-term
behavior and study an important application of these ideas.


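(a) To begin investigating the long-term behavior of the system, we will
assume that A is an $n \times n$ matrix with n real linearly independent
eigenvectors $\mathbf{v}_1, \ldots, \mathbf{v}_n$. Furthermore, assume that the corresponding real
eigenvalues of A satisfy the relationship


$$|\lambda_1| > |\lambda_2| \ge \cdots \ge |\lambda_n|$$


Consider an initial vector $\mathbf{x}^{(0)}$. Explain why there exist constants $c_1, \ldots, c_n$
such that


$$\mathbf{x}^{(0)} = c_1 \mathbf{v}_1 + c_2 \mathbf{v}_2 + \cdots + c_n \mathbf{v}_n$$
and show that


$$A\mathbf{x}^{(0)} = c_1 \lambda_1 \mathbf{v}_1 + c_2 \lambda_2 \mathbf{v}_2 + \cdots + c_n \lambda_n \mathbf{v}_n$$


Furthermore, show that
$$\mathbf{x}^{(k)} = A^k \mathbf{x}^{(0)} = c_1 \lambda_1^k \mathbf{v}_1 + c_2 \lambda_2^k \mathbf{v}_2 + \cdots + c_n \lambda_n^k \mathbf{v}_n \tag{1.13.4}$$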


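(b) In (1.13.4), divide both sides by $\lambda_1^k$. What can you conclude about
$(\lambda_2/\lambda_1)^k$ as $k \to \infty$? Why can you make similar conclusions about
$(\lambda_j/\lambda_1)^k$ for $j = 3, \ldots, n$? Hence explain why for large k


$$\left(\frac{1}{\lambda_1}\right)^k A^k \mathbf{x}^{(0)} \approx c_1 \mathbf{v}_1$$


and thus why $A^k \mathbf{x}^{(0)}$ is an approximate eigenvector of A
corresponding to $\mathbf{v}_1$.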
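
The limiting behavior described in (b) can be observed on a small example. The following Python sketch is an illustration only; the matrix, the initial vector, and the helper name `mat_vec` are hypothetical choices and do not come from the text. The matrix used has eigenvalues 3 and 1 with eigenvectors $[1\ 1]^T$ and $[1\ -1]^T$, and the initial vector is $\frac{1}{2}[1\ 1]^T + \frac{1}{2}[1\ -1]^T$, so that $c_1 = \frac{1}{2}$.

```python
def mat_vec(A, v):
    """Multiply the matrix A (a list of rows) by the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# Hypothetical example: eigenvalues 3 and 1, eigenvectors [1, 1] and [1, -1].
A = [[2.0, 1.0],
     [1.0, 2.0]]
lam1 = 3.0          # dominant eigenvalue
x = [1.0, 0.0]      # x^(0) = 0.5*[1, 1] + 0.5*[1, -1]

k = 30
for _ in range(k):
    x = mat_vec(A, x)          # after the loop, x holds A^k x^(0)

# Dividing by lam1^k should leave approximately c1 * v1 = [0.5, 0.5].
scaled = [xi / lam1 ** k for xi in x]

# Residual of the eigenvector equation A s = lam1 * s for the scaled vector.
residual = [a - lam1 * s for a, s in zip(mat_vec(A, scaled), scaled)]
```

The contribution of the second eigenvector shrinks like $(1/3)^k$ here, which is why `scaled` is so close to $c_1 \mathbf{v}_1$ after 30 steps.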


(c) In studying a population like spotted owls, mathematical ecologists often
pay close attention to the various numbers of a species at different stages of
life. For example, for spotted owls there are three pronounced groupings:
juveniles (under 1 year), subadults (1 to 2 years old), and adults (2 years
and older). The owls mate during the latter two stages, breed as adults, and
can live for up to 20 years. A critical time in the life cycle and survival of
these owls is when the juvenile leaves the nest to build a home of its own.
Let the number of spotted owls in year k be represented by the vector^9


$$\mathbf{x}^{(k)} = \begin{bmatrix} j_k \\ s_k \\ a_k \end{bmatrix}$$


where $j_k$ is the number of juveniles, $s_k$ the number of subadults, and $a_k$ the
number of adults. Using field data, mathematical ecologists have
determined^10 that a particular spotted owl population is modeled by the
discrete dynamical system


$$\mathbf{x}^{(k+1)} = \begin{bmatrix} 0 & 0 & 0.33 \\ 0.18 & 0 & 0 \\ 0 & 0.71 & 0.94 \end{bmatrix} \mathbf{x}^{(k)}$$


What does this model imply about the percent of juveniles that survive to
become subadults? About the percent of subadults that survive to become
adults? About the percent of adults that survive from one year to the next?
What percent of adults produce juvenile offspring in a given year?


(d) Assume that in a given region, ecologists have measured the present
populations as follows: $j_0 = 200$, $s_0 = 45$, and $a_0 = 725$. Use the model
stated in (c) to determine the population $\mathbf{x}^{(k)} = [j_k\ s_k\ a_k]^T$ for
$k = 1, \ldots, 20$. Do you think the spotted owl will become extinct? Give a
convincing argument using not only your computations of the population
vectors but also the results of (b).


9 To read more about the issue of spotted owl survival, see the introduction to chapter 5 of
David C. Lay's Linear Algebra and its Applications.

10 R. H. Lamberson et al., "A Dynamic Analysis of the Viability of the Northern Spotted Owl in a
Fragmented Forest Environment," Conservation Biology 6 (1992), 505-512.


(e) Say that r is the fraction of juveniles that survive from one year to the next 
(that is, replace 0.18 in the matrix of the model with r). By experimenting
with different values of r, determine the minimum fraction of juveniles 
that must survive from one year to the next in order for the spotted owl 
population not to become extinct. How does your answer depend on the 
eigenvalues of the matrix? 


(f) Let $A$ be the $n \times n$ matrix of a discrete dynamical system and assume that
$A$ has $n$ real linearly independent eigenvectors. Let $\mathbf{x}^{(0)}$ be an initial vector
and let $\rho(A)$ denote the maximum absolute value of an eigenvalue of $A$.
Show that the following are true:


(i) If $\rho(A) < 1$, then $\lim_{k \to \infty} A^k \mathbf{x}^{(0)} = \mathbf{0}$.
(ii) If $\rho(A) = 1$ and $\lambda = 1$ is the unique eigenvalue having this maximum
absolute value, then $\lim_{k \to \infty} A^k \mathbf{x}^{(0)}$ is an eigenvector of $A$.
(iii) If $\rho(A) > 1$, then there exist choices of $\mathbf{x}^{(0)}$ for which $\| A^k \mathbf{x}^{(0)} \|$ grows
without bound.
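
The repeated multiplication called for in parts (c) through (e) can be carried out in any computing environment. The following is a minimal sketch in Python rather than Maple; the helper names `step` and `simulate` are our own illustrative choices, while the matrix entries and initial populations are the ones stated in parts (c) and (d).

```python
# Matrix of the owl model from part (c): x^(k+1) = A x^(k).
A = [[0.0, 0.0, 0.33],
     [0.18, 0.0, 0.0],
     [0.0, 0.71, 0.94]]


def step(matrix, x):
    """Return the matrix-vector product matrix * x as a list."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in matrix]


def simulate(matrix, x0, years):
    """Return the list [x^(0), x^(1), ..., x^(years)]."""
    history = [list(x0)]
    for _ in range(years):
        history.append(step(matrix, history[-1]))
    return history


# Initial populations from part (d): juveniles, subadults, adults.
history = simulate(A, [200.0, 45.0, 725.0], 20)
for k, (j, s, a) in enumerate(history):
    print(f"k={k:2d}  juveniles={j:8.2f}  subadults={s:7.2f}  adults={a:8.2f}")
```

To experiment as in part (e), one could replace the entry 0.18 with a trial value of r and rerun the loop.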
First-order differential equations


2.1 Motivating problems 


Differential equations arise naturally in many problems encountered when 
modeling physical phenomena. To begin our study of this subject, we introduce 
two fundamental examples that demonstrate the central role that differential 
equations play in our world. 

In section 1.1, we discussed how the amount of salt present in a system of 
two tanks can be modeled through a system of differential equations. Here, an 
even simpler situation is considered: our goal is to predict the amount of salt 
present in a city’s water reservoir at time t, given a set of determining conditions.

Suppose that the reservoir is filled to its capacity of 10000 m³, and that
measurements indicate an initial concentration of salt of $C_0 = 0.02$ g/m³. Note
that it follows there are $A_0 = 200$ g of salt initially present. As the city draws
this solution from the reservoir for use, new solution (water with some salt
concentration) from the local treatment facility flows into the reservoir so that
the volume of water present in the tank stays constant. Let us assume that the
concentration of salt in the inflowing solution is 0.01 g/m³, and that the rate
of this inflow is 1000 m³/day. Since the city is also assumed to be drawing
solution at an equal rate from the reservoir, the outflow also occurs at a rate of
1000 m³/day.

We are interested in several key questions. How much salt is in the tank at 
time t? What is the concentration of salt in the water being used by the city at 
time t? What happens to these values over time? 

We will let A(t) denote the amount of salt in the tank at time t. The 
instantaneous rate of change dA/dt of A(t) is given by the difference between
the rate at which salt is entering the tank and the rate at which salt is
leaving. Exploring the given information regarding inflow and outflow, we can
determine these rates precisely.

Since solution is entering the reservoir at 1000 m³/day containing a
concentration of 0.01 g/m³, it follows that salt is entering the tank at a rate of

\[ 1000 \, \frac{\text{m}^3}{\text{day}} \cdot 0.01 \, \frac{\text{g}}{\text{m}^3} = 10 \, \frac{\text{g}}{\text{day}} \]

For salt leaving the reservoir, the situation is slightly more complicated. Since we 
do not know the exact amount of salt present in the reservoir at time t, we denote 
this by A(t). Assuming that the solution in the reservoir is uniformly mixed, the 
concentration of salt in the outflowing solution is the ratio of the amount A(t) 
of salt to the volume of the tank. That is, the outflowing concentration is 


\[ \frac{A(t) \ \text{g}}{10000 \ \text{m}^3} \]

Since this outflow is occurring at a rate of 1000 m³/day, it follows that salt is
leaving the tank at a rate of

\[ 1000 \, \frac{\text{m}^3}{\text{day}} \cdot \frac{A(t) \ \text{g}}{10000 \ \text{m}^3} = \frac{A(t)}{10} \, \frac{\text{g}}{\text{day}} \]

It now follows that the instantaneous rate of change dA/dt of salt in the
tank in grams per day is given by the difference of the rate of salt entering and
the rate of salt leaving the tank. Specifically,

\[ \frac{dA}{dt} = 10 - \frac{A(t)}{10} \tag{2.1.1} \]


Note carefully what this last equation is saying: A(t) is an unknown function, 
but we have an equation that relates this unknown function to its derivative. 
Such an equation is called a differential equation. The solution to this equation 
is a function A(t) that makes the equation true. If we can solve the equation for 
A(t), we then will be able to predict the amount of salt in the tank at any time t.
Determining such solutions and their long-term behaviors is the main focus of 
this chapter. 
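
Before any solution technique is available, the reservoir equation can already be explored numerically. The sketch below is an illustration under assumptions: it uses Euler's method (a stepping scheme not developed in this section), the function names `euler` and `salt_rate` are our own, and the step size of 0.1 day is an arbitrary choice. The rate function dA/dt = 10 − A/10 and the initial amount A(0) = 200 come from the discussion above.

```python
def euler(f, t0, y0, h, n_steps):
    """Approximate y' = f(t, y), y(t0) = y0 using n_steps Euler steps of size h."""
    t, y = t0, y0
    points = [(t, y)]
    for _ in range(n_steps):
        y = y + h * f(t, y)   # follow the current slope for a short time h
        t = t + h
        points.append((t, y))
    return points


def salt_rate(t, A):
    """Rate in minus rate out, in grams per day: dA/dt = 10 - A/10."""
    return 10.0 - A / 10.0


# 500 steps of 0.1 day cover the first 50 days.
points = euler(salt_rate, 0.0, 200.0, 0.1, 500)
for t, A in points[::100]:
    print(f"t = {t:5.1f} days   A = {A:8.3f} g")
```

The printed values are approximations only; a smaller step size h would be expected to track the true solution more closely.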

Another important application of differential equations involves
population growth. Consider a population P(t) of animals. As likelihood of
reproduction depends on the number of animals present, it is natural to assume 
that the rate of change of P(t) is directly proportional to P(t). Phrased in terms 
of the derivative, this assumption means that 


\[ \frac{dP}{dt} = kP(t) \tag{2.1.2} \]

where k is some positive constant. Observe that (2.1.2) is a differential equation 
involving the function P. It is a standard exercise in calculus to show that 
functions of the form 

\[ P(t) = P_0 e^{kt} \]

are solutions to (2.1.2). 
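
The substitution can be written out in one line. In this check, $P_0$ is taken to be an arbitrary constant, which the second computation identifies as the initial population $P(0)$.

```latex
\[
\frac{dP}{dt} = \frac{d}{dt}\left[ P_0 e^{kt} \right] = k P_0 e^{kt} = k P(t),
\qquad P(0) = P_0 e^{0} = P_0 .
\]
```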
Because the function $P(t) = P_0 e^{kt}$ exhibits unbounded growth over time, it
turns out that this exponential growth model is not realistic beyond a relatively 
short period of time. A related, but more sophisticated, model of population 
growth is the logistic differential equation 


\[ \frac{dP}{dt} = kP(t)\left(1 - \frac{P(t)}{A}\right) \]


where the constant k is considered the reproductive rate of the population and 
the constant A is the surrounding environment’s carrying capacity. For example, 
if a population had a relative growth rate of k = 0.02 and a carrying capacity of 
A = 100, the population function would satisfy the differential equation 


\[ \frac{dP}{dt} = 0.02P(t)\left(1 - \frac{P(t)}{100}\right) \]


The logistic model, usually credited to the Belgian mathematician Pierre Verhulst,
accounts not only for reproductive growth, but also for mortality by considering 
environmental limitations on maximum population. The logistic equation is 
more challenging to solve; we will do so in section 2.7. 

In addition to mixing problems and models of population growth,
differential equations enjoy widespread applications in other physical phenomena.
Differential equations are also mathematically interesting in and of themselves, 
and in upcoming sections we will study not only their applications, but also their 
key properties and characteristics to better understand the subject as a whole. 


2.2 Definitions, notation, and terminology 


As we have seen with the examples 


\[ \frac{dA}{dt} = 10 - \frac{A}{10} \tag{2.2.1} \]

\[ \frac{dP}{dt} = 0.02P\left(1 - \frac{P}{100}\right) \tag{2.2.2} \]

\[ y'' + y = 0 \tag{2.2.3} \]


a differential equation is an equation relating an unknown function to one or 
more of its derivatives. Usually we will suppress the notation “A(t)” and instead 
simply write “A,” as in (2.2.1). We will interchangeably use the notations $y'$
and $dy/dt$ to represent the first derivative; similarly, $y'' = d^2y/dt^2$. Other books
sometimes employ the notations $y' = D(y) = \dot{y}$ and $y'' = D^2(y) = \ddot{y}$.

A solution of a differential equation is a differentiable function that satisfies 
the equation on some interval (a, b) of values for the independent variable. 
For example, the function $y = \sin t$ is a solution to (2.2.3) on $(-\infty, \infty)$ since
$y'' = -\sin t$, and $-\sin t + \sin t = 0$ for all values of $t$.

Given any differential equation, we are interested in determining all of its 
solutions. But many, if not most, differential equations are difficult or impossible
to solve. For example, the equation

\[ y'' + ty = t \]

(which is only a slightly modified version of (2.2.3)) has a general solution that
cannot be expressed in terms of elementary functions.¹ In such situations, we may turn to qualitative or
approximation methods that may enable us to analyze how a solution should 
behave, while perhaps not being able to determine an explicit formula for the 
function. 

Equations (2.2.1), (2.2.2), and (2.2.3) are often called ordinary differential 
equations, in contrast to partial differential equations such as 

\[ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 \]

where the solution function u(x, y) has two independent variables x and y. Our 
focus will be on ordinary differential equations, as partial differential equations 
are beyond the scope of this text. The order of a differential equation is the order 
of the highest derivative present. For example, (2.2.1) and (2.2.2) are first-order 
differential equations since they only involve first derivatives. Equation (2.2.3) 
is second-order. For now, we limit our attention to first-order equations; higher 
order equations will be discussed in detail in subsequent chapters. 

It is important to note that every student of calculus learns to solve a certain 
class of differential equations through integration. For example, the problem, 
“find a function y whose derivative is $te^t$” can be restated as a differential
equation. In particular, this problem can be stated as the differential equation

\[ \frac{dy}{dt} = te^t \tag{2.2.4} \]

Integrating both sides with respect to t and using integration by parts on the
right, it follows that

\[ y(t) = te^t - e^t + C \]

is a solution for any choice of the constant C. Here we see an important 
fact: differential equations typically have a family of infinitely many solutions. 
Determining all possible members of that family, like determining all solutions 
to systems of linear equations in linear algebra, will be a central component of 
our work. 

¹ This fact is not obvious.

Calculus students also know that if we are given one more piece of
information about the function y along with (2.2.4), it is possible to uniquely
determine the integration constant, C. For example, had the problem above
read, “find a function y whose derivative is $te^t$ such that $y(0) = 5$,” we could
integrate to find $y = te^t - e^t + C$, just as we did previously, and then use the
initial condition $y(0) = 5$ to see that C must satisfy the equation

\[ 5 = 0 \cdot e^0 - e^0 + C \]

and thus $C = 6$. When we are given a differential equation of order n along
with n initial conditions, we say that we are solving an initial-value problem.
In the given example, $y = te^t - e^t + 6$ is the solution to the stated initial-value
problem.
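
A proposed solution to an initial-value problem can always be checked numerically as well as by hand. The following sketch, with helper names of our own choosing, compares a central-difference estimate of the derivative of $y = te^t - e^t + 6$ with the required value $te^t$ at a few sample times, and confirms the initial condition.

```python
import math


def y(t):
    """Candidate solution of y' = t e^t with y(0) = 5."""
    return t * math.exp(t) - math.exp(t) + 6.0


def central_difference(g, t, h=1e-6):
    """Estimate g'(t) from values of g on either side of t."""
    return (g(t + h) - g(t - h)) / (2.0 * h)


print("y(0) =", y(0.0))
for t in (-1.0, 0.0, 0.5, 2.0):
    estimate = central_difference(y, t)
    required = t * math.exp(t)
    print(f"t = {t:4.1f}   estimated y' = {estimate:.6f}   t e^t = {required:.6f}")
```

Agreement at a handful of points is evidence rather than proof; the substitution argument in the text is what establishes the result.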

Based on the example above and our experience in calculus, it is clear that 
integration is an obvious (and often effective) approach to solving differential 
equations of the form

\[ \frac{dy}{dt} = f(t) \]

where f(t) is a given function. If we can integrate f symbolically, then the
differential equation is solved. Even if f(t) cannot be integrated symbolically
with respect to t, we can still use techniques like numerical integration to
successfully attack the problem. The situation grows more complicated when
we want to solve differential equations that also involve the unknown function
y, such as

\[ \frac{dy}{dt} = te^y \]

In what follows in this chapter, we seek to classify first-order equations into 
types that can be solved in a straightforward way by symbolic means (often 
involving integration), as well as to develop methods that can be used to generate 
approximate solutions in situations where a symbolic solution is either difficult 
or impossible to attain. Throughout, the general form of the equations we are 
considering will be y’ = f(t, y), where the function f(t, y) represents some 
combination of the independent variable t and the unknown function y. 

It is also important to note that a wide range of first-order initial-value
problems² are guaranteed to have unique solutions. This is stated formally in the
following theorem, whose proof may be found in more advanced texts.


Theorem 2.2.1 Consider the initial-value problem given by $y' = f(t, y)$,
$y(t_0) = y_0$. If the function $f(t, y)$ is continuous on a rectangle that includes
$(t_0, y_0)$ in its interior and the partial derivative³ $f_y(t, y)$ is continuous on
that same rectangle, then there exists an interval containing $t_0$ on which the
initial-value problem has a unique solution.


Often the dependent variable, or unknown function y, in a differential 
equation will model an important quantity in some physical problem: the 
amount of salt in a tank at time t, the number of members of a population
at a given time, or the position of a mass attached to a spring. As such, we 
will place particular emphasis on the graph of the solution function in order to 
better understand what the differential equation is telling us about the physical 
situation it models. 


² We often use the abbreviation IVP to stand for the phrase “initial-value problem.”
³ We typically use the notation $f_y(t, y) = \partial f/\partial y$.

Just as geometry and graphical interpretations shaped our understanding
of linear algebra in chapter 1, these perspectives will prove extremely helpful in 
our study of differential equations. We begin our explorations of these graphical 
interpretations through the reservoir problem from section 2.1 and the earlier 
example $y' = te^t$.

So far in our references to derivatives in the reservoir and population 
models, we have viewed the derivative as measuring the instantaneous rate of 
change of a quantity that is varying. From a more geometric point of view, we 
also know that the derivative of a function measures the slope of the tangent 
line to the function’s graph at a given point. For example, with the differential 
equation 


\[ \frac{dA}{dt} = 10 - \frac{A}{10} \tag{2.2.5} \]

we can say that if, at some time t, the amount of salt A is A = 20, then dA/dt = 
10 — 20/10 = 8. Thus, if A(t) is a solution to the differential equation, it follows 
that at any time where A(t) = 20, A’(t) = 8. Graphically, this means that at such 
a point, the slope of the tangent line to the curve must be 8. 

Since we are interested in the function A(t) over an interval of t-values, we
also expect that A(t) will take on a wide range of values. As such, it is natural to 
compute the slope of the tangent line determined by (2.2.5) for a large number 
of different values of A and t. Obviously computers are best suited to such a 
task, and, as we will see in the introduction to Maple commands at the end of 
this section, Maple and other computer algebra systems provide tools for doing 
so. Computing values of dA/dt over a grid of t and A values, we can plot a 
small portion of each corresponding tangent line at the point (t, A), and see the 
resulting slope field (or direction field). The slope field for (2.2.5) is shown in 
figure 2.1. 
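
The grid computation described above does not depend on any particular software. As a minimal sketch (in Python rather than Maple, with function names and grid spacing chosen by us purely for illustration), the slopes for (2.2.5) can be tabulated as follows.

```python
def slope_grid(f, t_values, y_values):
    """Return rows of slopes f(t, y); row i corresponds to y_values[i]."""
    return [[f(t, y) for t in t_values] for y in y_values]


def salt_rate(t, A):
    """Right-hand side of (2.2.5): dA/dt = 10 - A/10."""
    return 10.0 - A / 10.0


t_values = [0, 10, 20, 30, 40, 50]
A_values = [0, 50, 100, 150, 200]
grid = slope_grid(salt_rate, t_values, A_values)

# Print with the largest A on top, as it would appear in a plot.
for A, row in zip(reversed(A_values), reversed(grid)):
    print(f"A = {A:3d}: " + "  ".join(f"{s:6.1f}" for s in row))
```

Each printed row is constant because the right-hand side of (2.2.5) does not involve t; a plotting routine would draw a short segment with the tabulated slope at each grid point.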


Figure 2.1 The slope field for (2.2.5); the graph of the solution corresponding to an initial condition A(0) = 200 is included.

Observe that a slope field provides an intuitive way to understand the
information a first-order differential equation possesses: the slope at each point 
gives the direction of the solution at that point. Indeed, we use arrows instead 
of small lines in order to indicate the flow of the solution as time increases. In 
essence, the slope field is a map that the solution must navigate based on the 
initial point from which the function starts. For example, if we use the initial 
condition A(0) = 200 (as was given in the original example in section 2.1), we 
can start a graph at the point (0, 200) and follow the map. Doing so yields the 
curve shown in figure 2.1. 

Note particularly how we can clearly see the slope of the solution curve 
fitting with the slopes present in the direction field. Moreover, observe that the 
direction field provides an immediate overall sense of how every solution to the 
differential equation behaves: for any solution A(t), $A(t) \to 100$ as $t \to \infty$. This
makes sense physically, too, since the saltwater solution entering the reservoir
has concentration 0.01 g/m³. Over time, the concentration of solution in the
reservoir should tend to that level, and with 10000 m³ of solution present in the
reservoir, we expect the amount of salt to approach 100 g. 

Another example of a differential equation’s slope field provides further 
insights. For the differential equation 


\[ \frac{dy}{dt} = te^t \tag{2.2.6} \]

its slope field for the window $-2 < t < 1$ and $-2 < y < 2$ is given in figure 2.2.

We noted earlier that the general solution to this equation is $y = te^t - e^t + C$.
Moreover, given any initial condition, we can determine C. For example, if
$y(0) = 1/2$, $C = 3/2$. Likewise, if $y(0) = 0$, $C = 1$, and if $y(0) = -1$, $C = 0$. If
we plot the corresponding three functions with the slope field, then the three
members of the family of all solutions to the original differential equation
appear as shown in figure 2.2.

In integral calculus, students learn about families of antiderivatives⁴ and
how two members of such a family differ only by a constant. Here, we see this
fact graphically in the slope field of figure 2.2, and can add the perspective that
there exists a family of solutions to a certain differential equation. In upcoming
sections, we will learn new techniques for how to determine solutions analytically
in various circumstances, while not losing sight of the fact that every first-order
differential equation can be interpreted graphically through a direction field.

Finally, there is an important type of first-order differential equation (DE)
for which solutions can be determined algebraically. A first-order DE is said to
be autonomous if it can be written in the form $y' = f(y)$. That is, the independent
variable t is not involved explicitly in f(y). For example, the equation

\[ y' = 1 - y^2 \tag{2.2.7} \]

is autonomous.


⁴ An antiderivative F of a function f is a function that satisfies F’ = f.

Figure 2.2 The slope field for (2.2.6) along
with three solution functions for the initial
conditions y(0) = 1/2, y(0) = 0, and
y(0) = −1.

In addition, a solution y to a DE is called an equilibrium or constant solution 
if the function y is constant. In (2.2.7), both y = 1 and y = −1 are equilibrium
solutions to the DE. Such a solution is stable if all solutions with initial
conditions $y(t_0) = y_0$ with $y_0$ close to the equilibrium solution result in the
overall solution to the IVP tending toward the equilibrium solution. Otherwise, 
the equilibrium solution is called unstable. 

We close this section with an example regarding an autonomous differential 
equation. 


Example 2.2.1 Consider the differential equation $y' = (y^2 - 1)(y - 3)^2$.
Determine all equilibrium solutions to the equation, as well as whether or not 
each is stable or unstable. Finally, plot the direction field for the equation and 
include plots of the equilibrium solutions. 


Solution. To find the equilibrium solutions, we assume that y is a constant 
function, and therefore y’ = 0. Solving the algebraic equation 

\[ 0 = (y^2 - 1)(y - 3)^2 \]

we find that y = —1, y = 1, and y = 3 are the equilibrium solutions of the 
given DE. 

We can decide the stability of each equilibrium solution by studying the 
sign of $y'$ near the equilibrium value; note that $(y - 3)^2$ is always nonnegative.
To consider the stability of $y = -1$, observe that when $y < -1$,
$y' = (y + 1)(y - 1)(y - 3)^2 > 0$, since the first two factors are both negative and
the third is positive. When $y > -1$ (and $y < 1$), it follows that
$y' = (y + 1)(y - 1)(y - 3)^2 < 0$
since the middle factor is negative while the other two are positive. Hence, if a
solution starts just below $y = -1$, that solution will increase toward −1, whereas
if a solution starts just above $y = -1$, it will decrease to −1. This makes the
equilibrium $y = -1$ stable.

Figure 2.3 The slope field for $y' = (y^2 - 1)(y - 3)^2$ along with its three equilibrium
solutions.

These observations are easiest to make visually in the direction field. As seen 
in figure 2.3, the constant solution y = —1 is stable, since any solution with an 
initial condition just above or just below y = —1 will tend to y = —1. However, 
the solution at y = 1 is unstable, since any solution with an initial value just 
above or just below y = 1 will tend away from 1 (and tend toward y = 3 or 
y = —1, respectively). Finally, although solutions just below y = 3 tend to 3, 
any solution that begins just above y = 3 will increase away from that constant 
solution, and hence y = 3 is also unstable.⁵
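
The sign analysis used in this example can be automated for any autonomous equation. The sketch below is one possible illustration, with a function name `classify` and an offset `delta` that are our own assumptions: it evaluates the right-hand side slightly below and slightly above an equilibrium and reads off the direction of nearby solutions. Cases with the same sign on both sides are labeled "semi-stable", which this section counts as unstable.

```python
def f(y):
    """Right-hand side of the autonomous equation y' = (y^2 - 1)(y - 3)^2."""
    return (y**2 - 1.0) * (y - 3.0) ** 2


def classify(rhs, y_eq, delta=1e-3):
    """Classify an equilibrium by the sign of y' just below and just above it."""
    below = rhs(y_eq - delta)
    above = rhs(y_eq + delta)
    if below > 0 and above < 0:
        return "stable"        # solutions on both sides move toward y_eq
    if below < 0 and above > 0:
        return "unstable"      # solutions on both sides move away from y_eq
    return "semi-stable"       # any other sign pattern, e.g. same sign on both sides


for y_eq in (-1.0, 1.0, 3.0):
    print(y_eq, classify(f, y_eq))
```

The choice of `delta` matters: it must be small enough that no other equilibrium lies between `y_eq - delta` and `y_eq + delta`.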


2.2.1 Plotting slope fields using Maple 


Just as our work in linear algebra required the use of Maple’s LinearAlgebra
package, to take advantage of the software’s support for the study
of differential equations we use the DEtools package, loading it with the
command


> with(DEtools):


⁵ Some authors call a solution such as y = 3 in this example semi-stable, since there is stability on one
side and instability on the other.

To plot the direction field associated with a given differential equation, it
is convenient to first define the equation itself in Maple. This is accomplished
(for the equation from the reservoir problem) through the following
command:


> Eq1 := diff(A(t), t) = 10-1/10*A(t);


Note that the differential equation of interest is now stored in “Eq1”. The slope
field may now be generated by the command


> DEplot(Eq1, A(t), t = 0 .. 50, A(t) = 0 .. 200,
color = grey, arrows=large);


This command produces the slope field of figure 2.1, but without any particular
solution satisfying an initial value included. Note that the choice of ranges
for t and A(t) is extremely important. Without a well-chosen window
selected by the user, the plot Maple generates may not be very insightful. 
For example, if the above command were changed so that the range of A(t) 
values is 0 .. 10, almost no information can be gained from the slope 
field. As such, we will strive to learn to analyze the expected behavior of a 
differential equation from its form so that we can choose windows well in 
related plots; we may often have to experiment and explore to find graphs that 
are useful. 

Finally, if we are interested in one or more related initial-value problems,
a variation of the DEplot command enables us to sketch the graph of each
corresponding solution. For example, the command


> DEplot(Eq1, A(t), t = 0 .. 50, A(t) = 0 .. 200,
color = grey, arrows=large, [[0,200]]);


will generate not only the slope field, but also the graph of the solution A(t)
that satisfies A(0) = 200, as shown in figure 2.1. Additional curves for different
initial conditions may be plotted by listing the other conditions to be satisfied:
for example, in the stated command above we could replace [[0,200]] with
[[0,200], [0,100], [0,0]] to include the plots of the three solution
curves that respectively satisfy A(0) = 200, A(0) = 100, and A(0) = 0.


Exercises 2.2 
1. Consider the differential equation $y'' = 4y$.


(a) What is the order of this equation?

(b) Show via substitution that the function $y = e^{2t}$ is a solution to this
equation.

(c) Are there any other functions of the form $y = e^{rt}$ ($r \neq 2$) that are also
solutions to the equation? If so, which? Justify your answer.




2. For a ball thrown straight up from an initial height s(0) = 4 meters at an 
initial velocity of s’(0) = 10 m/s, we know that after being thrown, the only 
force acting on the ball is gravity, provided we neglect air resistance. 
Knowing that acceleration due to gravity is constant at −9.81 m/s², it
follows that $s''(t) = -9.81$. Use the given information to determine s(t),
the function that tells us the height of the ball at time t. Then determine 
the maximum height the ball reaches, as well as the time the ball lands. 


3. In the differential equation dA/dt = 10 — A/10 from the reservoir 
problem, explain why the function A(t) = 100 is an equilibrium solution 
to the equation. Is it stable or unstable? Why? 


4. Consider the logistic differential equation 


\[ \frac{dP}{dt} = 0.02P\left(1 - \frac{P}{100}\right) \]


Use Maple to plot the direction field for this equation. Print the output 
and, by hand, sketch the solutions that correspond to the initial conditions 
P(0) = 10, P(0) = 75, and P(0) = 125. What is the long-term behavior of 
every solution P(t) for which P(0) > 0? Are there any constant (or 
equilibrium) solutions to the equation? Explain what these observations 
tell you about the behavior of the population being modeled. 


5. For the logistic differential equation 


\[ \frac{dP}{dt} = 0.001P\left(1 - \frac{P}{25}\right) \]


how should the direction field appear? Use the constant/equilibrium 
solutions to the equation as well as the long-term behavior of the 
population to help you sketch, by hand, the direction field for this DE. 


6. By constructing tangent lines over a grid with at least sixteen vertices, 
sketch a direction field by hand for each of the following differential 
equations. 


(a) y'=1-y 
(b) y’ = 5(t—-y) 
(c) y= 4(t+y) 
(d)y=1-t 


7. Without using Maple to plot direction fields, match each of the following 
differential equations with its corresponding direction field. Write at least 
one sentence to explain the reasoning behind each of your choices. 


dy dy _ dy dy 
(a) y-t (b)—=y =e d)a= 


dt dt Y 
In exercises 8-15, use integration to find a family of solutions for the given
differential equation. 


8. 
9. 


10. 


11. 
12. 


13. 


14, 


13: 


y=t?42 

y' =t+cost 

Yaa 
té+1 

y' =t? +2 

y" =5t 

y’ =tsint 

; 1 

¥* P45t+6 

yl =te" 
In exercises 16-23, solve each of the following initial-value problems.
16. y/=t?7+2, y(1)=4 
17. y'=t+cost, y(r/2)=1 
_ t 

r+1 
9. y=P4+2, y=4y(I)=-2 
20.y”=5t, y(-1)=3,y'(-1)=—-1, y"(-1) =0 
21.y’=tsint, y(0)=2 

1 


io NS 
Y= Fase 7 


18. y’ y(0) =3 


23.y=te, y(0)=—1 


24. For an nth-order IVP of the form $y^{(n)} = f(t)$, how many initial conditions
are needed in order to uniquely determine the solution y(t)? Explain. 


For each of the autonomous differential equations given in exercises 25-29, 
algebraically determine all equilibrium solutions to the DE. In addition, plot 
an appropriate direction field and use it to classify each equilibrium solution as 
stable or unstable. 


25. $y' = 3 - 2y$

26. $y' = -y^2 - 5y - 6$
27. y'=y-y° 

28. y’ =e V(1+y’) 
29. $y' = (y - 1)(y - 3)^2$


2.3 Linear first-order differential equations 


Some classes of differential equations can usually be solved by certain standard 
techniques. In this section, we consider the class of linear first-order differential 
equations and develop an approach for solving any such equation. Since any 
first-order DE is an equation that involves the functions y and y’, it is natural 
for us to consider the different ways in which y and y’ may be combined. For 
example, the equations 


\[ y'y = e^t \tag{2.3.1} \]

\[ 2ty + y' \sin t = \cos t \tag{2.3.2} \]

\[ y' + \sin y = 2 \tag{2.3.3} \]

are all first-order DEs. Recall that in section 1.12 we discussed linear
combinations of generalized vectors. Here we can view y and y’ as functions that belong
to a vector space, and thus think about whether a certain combination of y and 
y’ is a linear combination or not. We say that any differential equation of the 
form 


\[ a_1(t)y' + a_0(t)y = b(t) \tag{2.3.4} \]

is a linear first-order differential equation, since a linear combination of y and y’ is
being formed. Any other first-order differential equation is said to be nonlinear.
If we stipulate that $a_1(t) \neq 0$, we can divide through by $a_1(t)$ and hence write

\[ y' + p(t)y = f(t) \tag{2.3.5} \]


as the standard form for a linear first-order equation. We call f(t) the 
forcing function. Above, note that (2.3.1) and (2.3.3) are nonlinear equations, 
while (2.3.2) is linear. 

The simplest linear first-order differential equations are those for which the 
forcing function is zero. We naturally call the equation 


y' + p(t)y =0 (2.3.6) 


a homogeneous linear first-order DE. We consider a particular example that 
shows how every such homogeneous DE may be solved. 


Example 2.3.1 Solve the differential equation $y' + (1 + 3t^2)y = 0$. In addition,
solve the initial-value problem that is given by the same DE and the initial 
condition y(0) = 4. 


Solution. We will use integration to solve for y. Rearranging the given 
equation, we observe that $y' = -(1 + 3t^2)y$. Dividing both sides by y, we find
that

\[ \frac{y'}{y} = -(1 + 3t^2) \]

Keeping in mind the fact that y and y’ are each unknown functions of t, we
integrate both sides of the previous equation with respect to t:

\[ \int \frac{y'}{y} \, dt = \int -(1 + 3t^2) \, dt \]

We recognize from the chain rule that the left-hand side is $\ln y$. Thus, integrating
the polynomial in t on the right yields

\[ \ln y = -t - t^3 + C \]


We note that while an arbitrary constant arises on each side of the equation when 
integrating, it suffices to simply include one constant on the right. Finally, we 
solve for y using properties of the natural logarithm and exponential functions 
to find that 

\[ y = e^{-t - t^3} e^C \]

Since C is a constant, so is $e^C$, and thus we write

\[ y = Ke^{-t - t^3} \]


Observe that we have found an entire family of functions that solve the original 
differential equation: regardless of the constant K, the above function y is a 
solution. If we consider the stated initial-value problem and apply the given 
initial condition y(0) = 4, we immediately see that K = 4, and the solution to 
the initial-value problem is 

\[ y = 4e^{-t - t^3} \]
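
As a check on the example, the family of solutions can be substituted back into the original equation. The computation below uses only the chain rule and the initial condition stated in the example.

```latex
\begin{align*}
y &= K e^{-t - t^3}, \\
y' &= K(-1 - 3t^2)\, e^{-t - t^3} = -(1 + 3t^2)\, y, \\
y' + (1 + 3t^2)\, y &= 0, \qquad y(0) = K e^{0} = K = 4 .
\end{align*}
```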


The solution method in example 2.3.1 can be generalized to apply to any 
homogeneous linear first-order DE. Using the notation p(t) to replace the 
function $1 + 3t^2$, which is the coefficient of y, the same steps above may be used
to find the solution to the standard homogeneous linear first-order differential 
equation. We state this result in the following theorem. 


Theorem 2.3.1 For any homogeneous linear first-order differential equation 
of the form 


y' +p(t)y =0, 


the general solution is $y = Ke^{-P(t)}$, where P is any antiderivative of p. Moreover,
for the initial condition $y(t_0) = y_0$, if p(t) is continuous on an interval containing
$t_0$, then the solution to the corresponding initial-value problem is unique.


The uniqueness of the solution to the initial-value problem follows from 
theorem 2.2.1. But perhaps the most important lesson to learn from this result is 
that a homogeneous linear first-order DE can always be solved. This is analogous 
to our experience with homogeneous linear systems of algebraic equations in 
chapter 1. In particular, note that by taking K = 0, the zero function (y = 0) 
is always a solution to y’ + p(t)y = 0; in addition, the homogeneous linear 
first-order DE has infinitely many solutions. This is very similar to how, for a 
given matrix A, the homogeneous equation Ax = 0 always has the zero vector 
as a solution and, in the case where A is singular, Ax = 0 has infinitely many 
solutions. 

Having now completely addressed the case of a homogeneous linear first- 
order DE, we turn to the nonhomogeneous case. In particular, we are interested 
in solving the equation 


\[ y' + p(t)y = f(t) \tag{2.3.7} \]

where f(t) is not identically zero. Recalling the product rule from calculus,

\[ \frac{d}{dt}[v(t)y] = v(t)y' + v'(t)y \tag{2.3.8} \]

we observe that the left-hand side of (2.3.7), y’ + p(t)y, looks similar to the
right-hand side of (2.3.8). If we multiply both sides of (2.3.7) by an unknown
function v(t), we have

\[ v(t)y' + v(t)p(t)y = v(t)f(t) \tag{2.3.9} \]


Next, we observe that if v(t) is a function such that v’(t) = v(t)p(t), then it 
follows from the product rule that (2.3.9) has the form 


$$\frac{d}{dt}[v(t)y] = v(t)f(t) \tag{2.3.10}$$


We assume temporarily that such a function $v(t)$ exists; we will say more
about $v(t)$ shortly. Integrating both sides of (2.3.10), we now
see that


$$v(t)y = \int v(t)f(t)\,dt \tag{2.3.11}$$

To solve for $y$, we divide both sides by $v(t)$, yielding

$$y = \frac{1}{v(t)} \int v(t)f(t)\,dt \tag{2.3.12}$$


Prior to (2.3.10), we stipulated a condition on $v$ that enabled us to proceed.
In particular, we noted that “if $v(t)$ is a function such that $v'(t) = v(t)p(t)$,” then
we could find a solution in terms of $v$. Observe that the differential equation $v$
satisfies is, in fact, a homogeneous linear first-order equation itself ($v' - p(t)v =
0$), and therefore its solution is

$$v(t) = Ke^{P(t)},$$

where $P(t) = \int p(t)\,dt$. Since we only need one such nonzero function $v$ to
proceed, we set $K = 1$. From this and our conclusion in (2.3.12), we have
determined that


$$y = e^{-P(t)} \int e^{P(t)} f(t)\,dt \tag{2.3.13}$$


where $P(t) = \int p(t)\,dt$. The function $v(t) = e^{P(t)}$ is usually called an integrating
factor.
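
When the integrals in (2.3.13) have no convenient closed form, the same formula can be evaluated numerically. The Python sketch below is only an illustration under stated assumptions: the helper names `simpson` and `solve_linear_ivp` are hypothetical, Simpson's rule is one of many possible quadrature choices, and the antiderivative $P$ is chosen so that $P(t_0) = 0$, which turns the constant of integration into the initial value $y_0$.

```python
import math

def simpson(g, a, b, n=200):
    """Approximate the integral of g over [a, b] with composite Simpson's rule (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def solve_linear_ivp(p, f, t0, y0, t):
    """Value y(t) of the solution to y' + p(t) y = f(t), y(t0) = y0.

    With P(s) = integral of p from t0 to s, the integrating factor is
    v(s) = exp(P(s)) and v(t0) = 1, so
    y(t) = exp(-P(t)) * (y0 + integral from t0 to t of exp(P(s)) f(s) ds).
    """
    P = lambda s: simpson(p, t0, s)
    inner = simpson(lambda s: math.exp(P(s)) * f(s), t0, t)
    return math.exp(-P(t)) * (y0 + inner)

# Illustration (assumed data): p(t) = 2, f(t) = 4, y(0) = 5,
# whose exact solution is y = 2 + 3 e^{-2t}.
approx = solve_linear_ivp(lambda t: 2.0, lambda t: 4.0, 0.0, 5.0, 1.0)
exact = 2 + 3 * math.exp(-2.0)
print(approx, exact)
```

The nested quadrature is slow for large step counts, but it mirrors the structure of (2.3.13) directly, which is the point of the sketch.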

We next consider two examples of nonhomogeneous linear first-order 
differential equations and apply the method we just derived to solve them. 


Example 2.3.2 Solve the differential equation y’ + 2y = 4. 


Solution. In this equation, p(t) = 2, and therefore P(t) = 2t. From (2.3.13), 
it follows that 


\begin{align}
y(t) &= e^{-P(t)} \int e^{P(t)} f(t)\,dt \notag \\
&= e^{-2t} \int e^{2t} \cdot 4\,dt \notag \\
&= e^{-2t}\left(2e^{2t} + C\right) \tag{2.3.14} \\
&= 2 + Ce^{-2t} \tag{2.3.15}
\end{align}

There are several important observations to make from our work in
example 2.3.2. First, the parentheses in (2.3.14) are essential. Without them,
$e^{-2t}$ is not multiplied by the entire antiderivative, and the function $y$ would no
longer be a solution to the given DE.

A second observation is that if we had instead solved the corresponding homogeneous
differential equation $y' + 2y = 0$, we would have found the so-called
complementary solution $y_h = Ce^{-2t}$. Moreover, by observing that $y' = 4 - 2y =
2(2 - y)$, if we consider the function $y_p = 2$, it is apparent that $y_p$ is a
solution to the nonhomogeneous equation $y' + 2y = 4$. In addition, if we
omit the constant of integration $C$ in (2.3.14), it follows that the method
derived in (2.3.13) can be viewed as producing a so-called particular solution
$y_p$ that is a solution to the given nonhomogeneous linear first-order differential
equation.

Thus we see that the method derived in (2.3.13) and implemented to
find (2.3.15) ultimately expresses the solution to the original nonhomogeneous
linear first-order DE in the form


$$y = y_p + y_h$$

where $y_p$ is a particular solution to the nonhomogeneous equation, while $y_h$ is
the complementary solution, the solution to the corresponding homogeneous
equation.

This situation reminds us of one way to view the general solution to a system 
of linear equations given by Ax = b, where in (1.5.1) in section 1.5 we found 
that $x = x_p + x_h$. A further discussion of this property of linear first-order DEs
will occur in theorem 2.3.2 to close the current section. Before doing so, we 
consider another example. 


Example 2.3.3. Solve the nonhomogeneous first-order linear differential 
equation 
$$y' + y\tan t = \cos t$$

In addition, solve the initial-value problem (IVP) that is given by the same DE
and the initial condition $y(\pi/3) = 1$.

Solution. We first determine the integrating factor $v(t)$. Since $p(t) = \tan t$, it
follows that

$$P(t) = \int \tan t\,dt = -\ln(\cos t)$$


Thus, $v(t) = e^{-\ln(\cos t)}$. Applying the integrating factor and using properties of
exponential and logarithmic functions, we now observe that

\begin{align*}
y &= e^{\ln(\cos t)} \int \cos t \cdot e^{-\ln(\cos t)}\,dt \\
&= \cos t \int \cos t \cdot \frac{1}{\cos t}\,dt \\
&= \cos t \int 1\,dt \\
&= \cos t\,(t + C)
\end{align*}

Thus, the general solution to the given differential equation is $y = t\cos t +
C\cos t$.


To solve the corresponding IVP with the condition that $y(\pi/3) = 1$, it follows
that $1 = \frac{\pi}{3} \cdot \frac{1}{2} + C \cdot \frac{1}{2}$, so that $C = 2 - \pi/3$. The solution is


$$y = t\cos t + (2 - \pi/3)\cos t$$
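
A quick numerical check can confirm both the initial condition and the differential equation for this solution. The sketch below is illustrative only: the function names are hypothetical, and the derivative is approximated by a central difference rather than computed symbolically.

```python
import math

C = 2 - math.pi / 3  # constant fixed by the initial condition y(pi/3) = 1

def y(t):
    """Candidate IVP solution y = t cos t + (2 - pi/3) cos t."""
    return t * math.cos(t) + C * math.cos(t)

def residual(t, h=1e-6):
    """Left side minus right side of y' + y tan t = cos t, with y' from a central difference."""
    dy = (y(t + h) - y(t - h)) / (2 * h)
    return dy + y(t) * math.tan(t) - math.cos(t)

# Sample points inside (-pi/2, pi/2), where tan t and ln(cos t) are defined.
sample_points = [-1.0, -0.5, 0.0, 0.5, 1.0, math.pi / 3]
max_residual = max(abs(residual(t)) for t in sample_points)
print(y(math.pi / 3), max_residual)
```

A residual near zero at every sample point is consistent with the closed form, though it is evidence rather than proof.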


As in example 2.3.2, we note that the solution $y = t\cos t + C\cos t$ in
example 2.3.3 is of the form $y = y_p + y_h$, where $y_h = C\cos t$ can easily be
checked to be the solution to the corresponding homogeneous equation.

Two important results can now be stated in general. The first is a
formal statement of our derivation in (2.3.12) that shows how we can use
an integrating factor to solve any nonhomogeneous linear first-order DE. The
second demonstrates that for any of these types of DEs, if $y_p$ is a particular
solution to the nonhomogeneous DE and $y_h$ is a complementary solution to
the corresponding homogeneous DE, then $y = y_p + y_h$ is also a solution to the
nonhomogeneous DE.


Theorem 2.3.2 For any nonhomogeneous linear first-order differential 
equation of the form 


$$y' + p(t)y = f(t),$$


the general solution is

$$y = e^{-P(t)} \int e^{P(t)} f(t)\,dt$$


where $P(t) = \int p(t)\,dt$. Moreover, for the initial condition $y(t_0) = y_0$, if $p(t)$
and $f(t)$ are continuous on an interval containing $t_0$, then the solution to the
corresponding initial-value problem is unique.


The proof of the first part of theorem 2.3.2 is given above in the discussion
of (2.3.7) through (2.3.12). The uniqueness of the solution to the IVP follows from
theorem 2.2.1.

Finally, we observe that given a nonhomogeneous linear first-order
differential equation $y' + p(t)y = f(t)$, a particular solution $y_p$ (so
$y_p' + p(t)y_p = f(t)$), and a complementary solution $y_h$ to the corresponding
homogeneous equation ($y_h' + p(t)y_h = 0$), it follows that

\begin{align*}
(y_p + y_h)' + p(t)(y_p + y_h) &= y_p' + y_h' + p(t)y_p + p(t)y_h \\
&= (y_p' + p(t)y_p) + (y_h' + p(t)y_h) \\
&= f(t) + 0 \\
&= f(t)
\end{align*}

Therefore, $y_p + y_h$ is also a solution to the nonhomogeneous DE. Formally,
we have the following result.


Theorem 2.3.3 For any nonhomogeneous linear first-order differential 
equation, 


$$y' + p(t)y = f(t)$$


if $y_p$ is a particular solution to the nonhomogeneous equation and $y_h$ is a solution
to the corresponding homogeneous equation, then $y = y_p + y_h$ is also a solution
to the nonhomogeneous equation.


Exercises 2.3 
In exercises 1-6, classify each equation as linear or nonlinear. Do not attempt 
to solve the equations. 


1. $y' + 7y = e^t$

2. $\cos t\,y' + \sin t\,y = t^2$
3. $\cos y' + \sin y = t^2$
4.t/+Py=0? 
5, yy? =3t 
6.1=y/y’ 

In exercises 7-13, solve each of the given homogeneous linear first-order DEs. 


7. $y' + y = 0$


8. $y' + 2y = 0$
9. $y' + ty = 0$
10. $y' + \frac{2}{t}y = 0$


11. $y' = -y\cot t$
12. $(1 + t^2)y' + 2ty = 0$


13. $y' = -\frac{2}{100 - t}\,y$


In exercises 14—20, solve each of the given nonhomogeneous linear first-order 
DEs. 


14. $y' + y = 2$
15. $y' + 2y = 2t$
16. $y' + ty = 10t$
17. $y' + \frac{2}{t}y = e^t$
18. $y' = -(y - 1)\cot t$
19. $(1 + t^2)y' + 2ty = 2t$


20. $y' = 0.03 - \frac{2}{100 - t}\,y$

In exercises 21—27, solve each of the given initial-value problems. 
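21. $y' + y = 2$, $y(0) = 3$

22. $y' + 2y = 2t$, $y(1) = 0$


23. $y' + ty = 10t$, $y(0) = 5$

24. $y' + \frac{2}{t}y = e^t$, $y(1) = 4$, $t > 0$


25. $y' = -(y - 1)\cot t$, $y(\pi/2) = 1$, $0 < t < \pi$
26. $(1 + t^2)y' + 2ty = 2t$, $y(0) = 1$


27. $y' = 0.03 - \frac{2}{100 - t}\,y$, $y(0) = 1$
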
In exercises 28-33, plot a slope field in an appropriate window of t and y values 
for each of the given DEs. In addition, in the same window, plot the solution 
to each given IVP. Compare each graph to the solutions you found in the 
corresponding exercises 21-27. 


28. $y' + y = 2$, $y(0) = 3$
29. $y' + 2y = 2t$, $y(1) = 0$
30. $y' + ty = 10t$, $y(0) = 5$


31. $y' + \frac{2}{t}y = e^t$, $y(1) = 4$, $t > 0$
32. $y' = -(y - 1)\cot t$, $y(\pi/2) = 1$, $0 < t < \pi$


33. $(1 + t^2)y' + 2ty = 2t$, $y(0) = 1$


34. With matrix multiplication, we noted that for any matrix A and 
appropriately sized vectors x and y, A(x +y) = Ax + Ay. In addition, 
for any constant c, A(cx) = cAx. We called these properties “the linearity 
of matrix multiplication.” In calculus, we learn that the derivative 
operator, D, satisfies similar properties of linearity. In particular, if f and 
g are differentiable functions and c is any constant, what can you say 
about $D(f + g)$ and $D(cf)$? (Recall that $D(f)$ is alternate notation
for $f'$.)


2.4 Applications of linear first-order differential equations


A large number of important physical situations can be modeled by linear 
first-order differential equations. In this section we introduce several such 
applications through examples and explore further scenarios in the exercises. 


2.4.1 Mixing problems 


Recall that in section 2.1, we encountered a problem where a saltwater solution 
was entering and exiting a city’s water reservoir. Specifically, in (2.1.1) we 
encountered the DE

$$\frac{dA}{dt} = 10 - \frac{A}{10}$$

This equation, rewritten in the form

$$A' + \frac{1}{10}A = 10$$

is a linear first-order DE that we can now easily solve. With $p(t) = 1/10$, the
integrating factor is $v(t) = e^{t/10}$, and therefore

\begin{align}
A &= e^{-t/10} \int e^{t/10} \cdot 10\,dt \tag{2.4.1} \\
&= e^{-t/10}\left(100e^{t/10} + C\right) \tag{2.4.2} \\
&= 100 + Ce^{-t/10} \tag{2.4.3}
\end{align}

From this result, we can also confirm our previous observation that as $t \to \infty$,
$A(t) \to 100$, for any solution $A(t)$ to the differential equation. Moreover, if we
consider the initial condition $A(0) = 200$ stated along with the original problem
in section 2.1, it follows that


$$A(t) = 100 + 100e^{-t/10}$$
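
As an independent check on this closed form, one can integrate the model $A' = 10 - A/10$, $A(0) = 200$ numerically and compare. The sketch below uses the classical fourth-order Runge-Kutta method, which is not part of this section and is brought in here purely as an assumption for the comparison; the function names are hypothetical.

```python
import math

def rk4(f, t0, y0, t_end, steps=1000):
    """Classical fourth-order Runge-Kutta approximation of y(t_end) for y' = f(t, y), y(t0) = y0."""
    h = (t_end - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h * k1 / 2)
        k3 = f(t + h / 2, y + h * k2 / 2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y

def salt_rate(t, A):
    """Right-hand side of A' = 10 - A/10 (rate in minus rate out)."""
    return 10 - A / 10

def closed_form(t):
    """Solution with A(0) = 200 obtained from the integrating factor."""
    return 100 + 100 * math.exp(-t / 10)

for t in (10, 50, 100):
    print(t, rk4(salt_rate, 0.0, 200.0, t), closed_form(t))
```

The two columns should agree closely, and both approach 100 as $t$ grows, matching the limiting behavior noted above.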


Certainly we can consider a wide range of variations on this mixing 
problem by changing concentrations, flow rates, and tank volumes. In every 
such scenario, the most important thing to keep in mind is that the rate of 
change of salt (or whatever quantity is under consideration) is the difference 
between the rate of salt entering and the rate exiting. Furthermore, an analysis 
of units is often very helpful. We consider one more example to demonstrate 
what can occur when the entering and exiting solutions are flowing at different 
rates. 


Example 2.4.1 Consider a tank in which 1 g of chlorine is initially present in
100 m³ of a solution of water and chlorine. A chlorine solution concentrated at
0.03 g/m³ flows into the tank at a rate of 1 m³/min, while the uniformly mixed
solution exits the tank at 2 m³/min. At what time is the maximum amount of
chlorine present in the tank, and how much is present?

Solution. To answer the questions posed, we set up and solve an IVP. We let
$A(t)$ denote the amount of chlorine in the tank (in grams) at time $t$ (in minutes).
We note from the inflow that the rate at which chlorine is entering the tank is
given by

$$\text{rate in} = 1\,\frac{\text{m}^3}{\text{min}} \cdot 0.03\,\frac{\text{g}}{\text{m}^3} \tag{2.4.4}$$

For the exiting flow, we must compute the concentration of chlorine present in
the solution leaving the tank. This concentration is given by the ratio of amount
present in grams to the total volume of solution in the tank at time $t$. In this
problem, note that the volume is changing as a function of time. In particular,
since solution enters at 1 m³/min and exits at 2 m³/min, it follows that the
volume $V(t)$ of solution present in the tank is decreasing at a rate of 1 m³/min.
With 100 m³ initially present, we observe that $V(t) = 100 - t$ is the volume of
solution in the tank at time $t$. Thus, the concentration of chlorine in the solution
exiting the tank at time $t$ is $A(t)/V(t)$ g/m³, and the rate at which chlorine
leaves the tank is given by

$$\text{rate out} = 2\,\frac{\text{m}^3}{\text{min}} \cdot \frac{A(t)}{V(t)}\,\frac{\text{g}}{\text{m}^3} = \frac{2A(t)}{100 - t}\,\frac{\text{g}}{\text{min}} \tag{2.4.5}$$


It follows from (2.4.4) and (2.4.5) that the overall instantaneous rate of change 
of chlorine in the tank with respect to time is 


$$\frac{dA}{dt} = \text{rate in} - \text{rate out} = 0.03 - \frac{2A}{100 - t}$$


Note that we also have the initial condition $A(0) = 1$. Rearranging the differential
equation, we see that we must solve the nonhomogeneous linear first-order
equation


$$A' + \frac{2}{100 - t}A = 0.03 \tag{2.4.6}$$

Applying the approach discussed in section 2.3, followed by the initial condition, 
it can be shown that the solution to (2.4.6) is 


$$A(t) = 3 - 0.03t - 0.0002(100 - t)^2$$


From the quadratic nature of this solution, as well as from the direction field
shown in figure 2.4, we can see that this function has a maximum value. It
is a straightforward exercise to show that this maximum of $A(t)$ occurs when
$t = 25$ min and that the maximum is $A = 1.125$ g.
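
The "straightforward exercise" can be carried out numerically as well. The following sketch (hypothetical helper names; bisection is one simple choice of root finder, not something prescribed by the text) locates the critical point of the stated solution and also checks that the solution satisfies $A' + \frac{2}{100 - t}A = 0.03$ at several times.

```python
def A(t):
    """Stated solution of A' + 2A/(100 - t) = 0.03 with A(0) = 1."""
    return 3 - 0.03 * t - 0.0002 * (100 - t) ** 2

def dA(t):
    """Derivative of A, computed by hand from the formula above."""
    return -0.03 + 0.0004 * (100 - t)

def bisect(g, lo, hi, tol=1e-10):
    """Root of g on [lo, hi] by bisection, assuming g changes sign there."""
    g_lo = g(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (g(mid) > 0) == (g_lo > 0):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

t_max = bisect(dA, 0.0, 100.0)  # critical point where A'(t) = 0
ode_gap = max(abs(dA(t) + 2 * A(t) / (100 - t) - 0.03) for t in range(0, 100, 5))
print(t_max, A(t_max), ode_gap)
```

Because $A$ is a downward-opening quadratic, the single critical point is the maximum.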


[Figure 2.4: Direction field for (2.4.6) with solution corresponding to the initial condition A(0) = 1; horizontal axis t, vertical axis A(t).]


2.4.2 Exponential growth and decay


A radioactive substance emits particles; in doing so, the substance decreases its
mass. This process is known as radioactive decay. For example, the radioactive
isotope carbon-14 emits particles and loses half its mass over a period of
5730 years. For any such isotope, the instantaneous rate of decay is proportional
to the mass of the substance present at that instant. Thus, assuming an initial
mass $M_0$ is present, it follows that the mass $M(t)$ of the substance at time $t$ must
satisfy the initial-value problem


$$M' = -kM, \quad M(0) = M_0 \tag{2.4.7}$$


for some positive constant $k$. Note that the minus sign is present in (2.4.7) since
the mass $M(t)$ is decreasing. It follows from our work with homogeneous linear
first-order DEs in section 2.3 that the solution to this equation is


$$M(t) = M_0e^{-kt} \tag{2.4.8}$$


Similarly, experiments show that a population with zero death rate (e.g., a colony 
of bacteria with sufficient food and no predators) grows at a rate proportional 
to the size of the population at time $t$. In particular, if $P(t)$ is the population
present at time $t$ and $P_0$ is the initial population, then $P$ satisfies the initial-value
problem $P' = kP$, $P(0) = P_0$, for some positive constant $k$. Here, it
follows that

$$P(t) = P_0e^{kt} \tag{2.4.9}$$

Problems involving radioactive decay and exponential population growth 
are very similar and should be familiar to students from past courses in calculus 
and precalculus. We include one example here for review and several more in 
the exercises at the end of the section. 


Example 2.4.2 A radioactive isotope initially has 40 g of mass. After 10 days
of radioactive decay, its mass is 39.7 g. What is the isotope’s half-life? At what
time $t$ will 1 g remain?


Solution. Because the isotope decays radioactively, we know that its mass
$M(t)$ must have the form $M(t) = M_0e^{-kt}$. To answer the questions posed, we
must first determine the constant $k$. In the given problem, we know that $M_0 = 40$
and that $M(10) = 39.7$. It follows that

$$39.7 = 40e^{-10k}$$

Dividing both sides of the equation by 40, taking natural logs, and solving for $k$,
we find that

$$k = -\frac{1}{10}\ln\left(\frac{39.7}{40}\right)$$

To compute the half-life, we now solve the equation

$$\frac{M_0}{2} = M_0e^{-kt}$$

for $t$. In particular, we have

$$20 = 40e^{\frac{1}{10}\ln\left(\frac{39.7}{40}\right)t}$$

Dividing by 40 and taking natural logs,

$$\ln\frac{1}{2} = \frac{1}{10}\ln\left(\frac{39.7}{40}\right)t$$

so

$$t = \frac{\ln(1/2)}{\frac{1}{10}\ln\left(\frac{39.7}{40}\right)}$$

Thus the half-life of the isotope is approximately 921 days.
Finally, to determine when 1 g of the substance will remain, we simply solve
the equation

$$1 = 40e^{\frac{1}{10}\ln\left(\frac{39.7}{40}\right)t}$$

Doing so shows that $t \approx 4900$ days.
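
The arithmetic in this example is easy to reproduce. The short sketch below (variable names are hypothetical) computes the decay constant from the two mass readings and then the half-life and the time at which 1 g remains, using only the model $M(t) = M_0e^{-kt}$.

```python
import math

M0, M10 = 40.0, 39.7  # grams at t = 0 and at t = 10 days

# From 39.7 = 40 e^{-10k}: k = -(1/10) ln(39.7/40)
k = -math.log(M10 / M0) / 10

half_life = math.log(2) / k          # solves M0/2 = M0 e^{-kt}
t_one_gram = math.log(M0 / 1.0) / k  # solves 1 = 40 e^{-kt}

print(k, half_life, t_one_gram)
```

Note that the half-life depends only on $k$, not on the initial mass, since $M_0$ cancels from both sides of the half-life equation.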


2.4.3 Newton’s law of Cooling 


Suppose that T(t) is the temperature of a body immersed in a cooler 
surrounding medium such as air or water. Sir Isaac Newton postulated (and 
experiments confirm) that the body will lose heat at a rate proportional to 
the difference between its present temperature and the temperature of its 
surroundings. If we assume that the temperature of the surrounding medium is
constant, say $T_m$, and that the warmer body’s initial temperature is $T(0) = T_0$,
then Newton’s law of Cooling can be expressed through the initial-value problem


$$T' = -k(T - T_m), \quad T(0) = T_0 \tag{2.4.10}$$


Written in the standard form of a nonhomogeneous linear first-order DE, we
find that $T$ satisfies the IVP


$$T' + kT = kT_m, \quad T(0) = T_0 \tag{2.4.11}$$

Solving this problem in the standard way reveals that the temperature of the
cooling body must satisfy

$$T(t) = (T_0 - T_m)e^{-kt} + T_m \tag{2.4.12}$$

We consider an example with some particular details given in order to analyze 
the behavior of the temperature function. 


Example 2.4.3. A can of soda at room temperature 70°F is placed in a 
refrigerator that maintains a constant temperature of 40°F. After 1 hour in 
the refrigerator, the temperature of the soda is 58°F. At what time will the soda’s 
temperature be 41°F? 


Solution. Let $T(t)$ denote the temperature of the soda at time $t$ in degrees
F; note that $T_0 = 70$. Since the surrounding temperature is 40, $T$ satisfies the
initial-value problem


$$T' = -k(T - 40), \quad T(0) = 70$$

and therefore by (2.4.12) $T$ has the form

$$T(t) = 30e^{-kt} + 40$$


In particular, note that the temperature is decreasing exponentially as time
increases and tending towards 40°F, the temperature of the refrigerator, as
$t \to \infty$.

To determine the constant $k$, we use the additional given information that
$T(1) = 58$, and therefore

$$58 = 30e^{-k} + 40$$

It follows that $e^{-k} = 3/5$, and thus $k = \ln(5/3)$. To now answer the original
question, we solve the equation


$$41 = 30e^{-\ln(5/3)t} + 40$$

and find that $t = \ln(30)/\ln(5/3) \approx 6.658$ h.
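
The two steps used here, fitting $k$ from one later temperature reading and then solving for a target time, apply to any problem governed by (2.4.12). A minimal sketch, with hypothetical function names and assuming a constant surrounding temperature, might look like this:

```python
import math

def cooling_constant(T0, Tm, t1, T1):
    """k in T(t) = (T0 - Tm) e^{-kt} + Tm, determined from one later reading T(t1) = T1."""
    return -math.log((T1 - Tm) / (T0 - Tm)) / t1

def time_to_reach(T0, Tm, k, target):
    """Time at which T(t) equals target; requires target strictly between Tm and T0."""
    return -math.log((target - Tm) / (T0 - Tm)) / k

k = cooling_constant(70.0, 40.0, 1.0, 58.0)  # equals ln(5/3)
t_41 = time_to_reach(70.0, 40.0, k, 41.0)    # equals ln(30)/ln(5/3)
print(k, t_41)
```

The target must lie strictly between $T_m$ and $T_0$; the model only approaches $T_m$ asymptotically, so no finite time reaches it exactly.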


Exercises 2.4 


1. A population of bacteria is growing at a rate proportional to the number of 
cells present at time $t$. If initially 100 million cells are present and after
6 hours 300 million cells are present, what is the doubling time of the 
population? At what time will 100 billion cells be present? 


2. The half-life of a radioactive element is 2000 years. What percentage of its 
original mass is left after 10 000 years? After 11 000 years? 


3. The evaporation rate of moisture from a sheet hung on a clothesline is 
proportional to the sheet’s moisture content. If one half of the moisture 
evaporates in the first 30 min, how long will it take for 95 percent of the
moisture to evaporate?


4. A population of 200 million people is observed to grow at a rate
proportional to the population present and to be increasing at a rate
of 2 percent per year. How long will it take for the population to triple?


5. In a certain lake, wildlife biologists determine that the walleye population
is growing very slowly. In particular, they conclude that the population 
growth is modeled by the differential equation P’ = 0.002P, where P is 
measured in thousands of walleye, and time t is measured in years. The 
biologists estimate that the initial population of walleye in the lake is 
100 000 fish. To enhance the fishery, the department of conservation 
begins planting walleye fingerlings in the lake at a rate of 5000 walleye per 
year. 


(a) Write an IVP that the population P(t) of walleye in the lake in year t 
will satisfy under the assumption that walleye are being added to the 
lake at a rate of 5000 fish per year. 

(b) Solve the IVP stated in (a). 

(c) In 20 years, how many more walleye will be in the lake than if the 
biologists had not planted any fish? 


6. Solve the IVP $A' = 0.03 - \frac{2}{100 - t}A$, $A(0) = 1$, in order to verify the
stated solution in example 2.4.1.


7. Brine (saltwater) is entering a 25 m³ tank at a flow rate of 0.25 m³/min and
at a concentration of 6 g/m³. The uniformly mixed solution exits the tank
at a rate of 0.25 m³/min. Assume that initially there are 15 m³ of solution
in the tank at a concentration of 3 g/m³.


(a) State an IVP that is satisfied by $A(t)$, the amount of salt in grams in the
tank at time $t$.

(b) What will happen to the amount of salt in the tank as $t \to \infty$? Why?

(c) Plot a direction field for the IVP stated in (a), including a plot of the
solution.

(d) At exactly what time will there be 75 g of salt present in the tank?


8. Brine is entering a 25 m³ tank at a flow rate of 0.5 m³/min and at a
concentration of 6 g/m³. The uniformly mixed solution exits the tank at a
rate of 0.25 m³/min. Assume that initially there are 5 m³ of solution in the
tank at a concentration of 25 g/m³.


(a) State an IVP that is satisfied by the amount of salt $A(t)$ in grams in the
tank at time $t$.

(b) Solve the IVP stated in (a). For what values of t is this problem valid? 
Why? 

(c) At exactly what time will the least amount of salt be present in the tank? 
How much salt will there be at that time? 

(d) Plot a direction field for the IVP stated in (a), including a plot of the 
solution. Discuss why this direction field and the solution make sense 
in the physical context of the problem. 


9. A body of water is polluted with mercury. The lake has a volume of 200
million cubic meters and mercury is present in a concentration of 5 grams
per million cubic meters. Health officials state that any level above 1 g per
million cubic meters is considered unsafe. If water unpolluted by mercury
flows into the lake at a rate of 0.5 million cubic meters per day, and
uniformly mixed lake water flows out of the lake at the same rate, how
long will it take for the lake to reach a mercury concentration that is
considered safe?


10. An average person takes eighteen breaths per minute and each breath
exhales 0.0016 m³ of air that contains 4 percent more carbon dioxide
(CO₂) than was inhaled. At the start of a seminar containing 300
participants, the room air contains 0.4 percent CO₂. The ventilation
system delivers 10 m³ of fresh air per minute to the room whose volume
is 1500 m³. Find an expression for the concentration level of CO₂ in the
room as a function of time; assume that air is leaving the room at the
same rate that it enters.


11. Solve the general Newton’s law of Cooling IVP $T' = -k(T - T_m)$,
$T(0) = T_0$ in order to verify the solution stated in (2.4.12).


12. A potato at room temperature of 72°F is placed in an oven set at 350°F.
After 30 min, the potato’s temperature is 105°F. At what time will the
potato reach a temperature of 165°F?


13. An object at a temperature of 80°C is placed in a refrigerator maintained
at 5°C. If the temperature of the object is 75°C at 20 min after it is placed
in the refrigerator, determine the time (in hours) at which the object will
reach 10°C.


14. An object at a temperature of 9°C is placed in a refrigerator that is initially
at 5°C. At the same time the object is placed in the refrigerator, the
refrigerator’s thermostat is adjusted in order to raise the temperature
inside from 5°C to 10°C; the function that governs the temperature of the
refrigerator is $R(t) = \frac{10}{1 + e^{-t}}$.


(a) Using the refrigerator’s temperature constant $k$ from exercise 13,
modify Newton’s law of Cooling appropriately to state an IVP whose
solution is the temperature of the object.

(b) Plot a direction field for the IVP from (a) and sketch an approximate
solution to the IVP.

(c) Discuss the qualitative behavior of the solution to the IVP. Estimate the
minimum temperature the object achieves.


15. On a cold, winter evening with an outdoor temperature of 4°F, a home’s
furnace fails at 10 pm. At the time of the furnace failure, the indoor
temperature was 68°F. At 2 am, the indoor temperature was 60°F.
Assuming the outside temperature remains constant, at what time will the
homeowner have to begin to worry about pipes freezing due to an indoor
temperature below 32°F?


2.5 Nonlinear first-order differential equations 


So far in our work with differential equations, we have seen that linear first- 
order differential equations have many interesting properties. One is that any 
IVP that corresponds to a linear first-order DE (with reasonably well-behaved 
functions $p(t)$ and $f(t)$) is guaranteed to have a unique solution. In addition,
through our development of integrating factors, we have a method by which we 
can always (at least in theory) determine a solution for the differential equation. 

Any differential equation that is not linear is called nonlinear. Thus, 
nonlinear differential equations constitute every other type of equation we 
can conceive. Unfortunately, nonlinear equations are (in general) far more 
difficult to solve than linear ones. We will limit ourselves in this section to 
considering a few relatively common special cases of nonlinear first-order 
differential equations that can be solved analytically. In section 2.6, we will 
consider qualitative and approximation techniques that enable us to gain 
valuable information from a nonlinear initial-value problem, even in the event 
that we cannot solve it explicitly. 


2.5.1 Separable equations 


In example 2.3.1 in section 2.3, we solved the differential equation $y' = -(1 +
3t^2)y$. While this equation is linear, our method provides insight into how to
approach a class of nonlinear equations whose structure is similar. We begin by
considering a slightly modified example.


Example 2.5.1 Solve the nonlinear first-order differential equation


$$y' = -(1 + 3t^2)y^2 \tag{2.5.1}$$


Solution. Following our approach in example 2.3.1, we can separate the
variables $y$ and $t$ algebraically to arrive at the equation


$$y^{-2}\frac{dy}{dt} = -(1 + 3t^2)$$

Integrating both sides of this equation with respect to $t$,

$$\int y^{-2}\frac{dy}{dt}\,dt = \int (-1 - 3t^2)\,dt \tag{2.5.2}$$


The left-hand side may be simplified to $\int y^{-2}\,dy$. Thus, evaluating each integral
in (2.5.2), we find that


$$-y^{-1} = -t - t^3 + C \tag{2.5.3}$$

We note again that since an arbitrary constant of integration arises on each side,
it suffices to include just one. It is essential here to observe that by successfully
integrating, we have removed the presence of $y'$ in the equation, and now have
only an algebraic, rather than differential, equation in $t$ and $y$. Solving (2.5.3)
algebraically for $y$, it follows that


$$y = \frac{1}{t + t^3 - C}$$
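
As a sanity check on the separation-of-variables result, the sketch below tests numerically that $y = 1/(t + t^3 - C)$ satisfies $y' = -(1 + 3t^2)y^2$. The function names and the choice $C = -1$ (which gives $y(0) = 1$) are assumptions made only for illustration, and the derivative is approximated by a central difference.

```python
def y(t, C=-1.0):
    """Candidate solution y = 1/(t + t^3 - C); C = -1 gives y(0) = 1."""
    return 1.0 / (t + t**3 - C)

def residual(t, C=-1.0, h=1e-6):
    """y' + (1 + 3t^2) y^2, which vanishes when y solves y' = -(1 + 3t^2) y^2."""
    dy = (y(t + h, C) - y(t - h, C)) / (2 * h)
    return dy + (1 + 3 * t**2) * y(t, C) ** 2

max_residual = max(abs(residual(0.25 * i)) for i in range(0, 9))  # t in [0, 2]
print(y(0.0), max_residual)
```

For other values of $C$ the denominator may vanish at some $t$, so the interval on which the solution exists depends on the initial condition.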


The strategy of example 2.5.1 may be applied to any differential equation of
the form $y' = f(t, y)$ where $f(t, y)$ can be decomposed into a product of a
function of $t$ only and a function of $y$ only. That is, if we can write


$$f(t, y) = g(t) \cdot h(y)$$


then we are able to separate the variables in the equation, writing all of the
$y$-terms on one side (multiplied by $y'$), and writing all of the $t$-terms on the
other. Any differential equation of the form $y' = g(t) \cdot h(y)$ is said to be separable.
We attempt to solve a separable differential equation by separating the variables
and writing


$$\frac{1}{h(y)}\,y' = g(t) \tag{2.5.4}$$

Writing $y'$ in the alternate notation $dy/dt$, we have

$$\frac{1}{h(y)}\frac{dy}{dt} = g(t) \tag{2.5.5}$$


Hence when we integrate both sides of (2.5.5) with respect to $t$, we find


$$\int \frac{1}{h(y)}\,dy = \int g(t)\,dt$$


Now, all of this work is only useful if we arrive at integrals we can actually 
evaluate. For example, if the left-hand side is $\int \sin\sqrt{y}\,dy$, we are really no closer
to solving for y than we were when considering the initial differential equation. 

In section 2.6, we will address ways to approximate the solution of such 
equations that we seem unable to solve analytically. For now, we consider a few 
examples of separable equations that we can solve, with more to follow in the 
exercises. 


Example 2.5.2 Find a family of solutions to the differential equation

$$\frac{y'}{t} = e^{t+2y}$$

and a solution to the corresponding initial-value problem with the condition
that $y(1) = 1$.

Solution. First, we may write $e^{t+2y} = e^te^{2y}$. Thus, we have

$$\frac{y'}{t} = e^te^{2y}$$

Separating the variables, it follows that

$$e^{-2y}\frac{dy}{dt} = te^t$$

Integrating both sides with respect to $t$, we may now write

$$\int e^{-2y}\,dy = \int te^t\,dt$$

Using integration by parts on the right and evaluating both integrals, we have

$$-\frac{1}{2}e^{-2y} = (t - 1)e^t + C$$

To now solve algebraically for $y$, we first multiply both sides by $-2$. Since $C$ is
an arbitrary constant, $-2C$ is just another constant, one that we will denote by
$C_1$. Hence

$$e^{-2y} = -2(t - 1)e^t + C_1$$

Taking logarithms and solving for $y$, we can conclude that

$$y = -\frac{1}{2}\ln\left(-2(t - 1)e^t + C_1\right)$$

is the family of functions that provides the general solution to the original DE.
To solve the corresponding IVP with $y(1) = 1$, we observe that

$$y(1) = -\frac{1}{2}\ln\left(-2(1 - 1)e^1 + C_1\right) = -\frac{1}{2}\ln(C_1) = 1$$

so $\ln(C_1) = -2$, and therefore $C_1 = e^{-2}$. The solution to the IVP is

$$y = -\frac{1}{2}\ln\left(-2(t - 1)e^t + e^{-2}\right)$$
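
The IVP solution $y = -\frac{1}{2}\ln\left(-2(t - 1)e^t + e^{-2}\right)$ can be tested numerically against the equation $y'/t = e^{t+2y}$ and the condition $y(1) = 1$. The sketch below is illustrative only: the names are hypothetical, the derivative is a central-difference approximation, and the sample interval $[0.2, 1]$ is an assumption chosen so that the logarithm's argument stays positive.

```python
import math

def y(t):
    """Candidate IVP solution y = -(1/2) ln(-2(t - 1)e^t + e^{-2})."""
    return -0.5 * math.log(-2 * (t - 1) * math.exp(t) + math.exp(-2))

def residual(t, h=1e-6):
    """y' - t e^{t + 2y}, i.e. the DE y'/t = e^{t+2y} multiplied through by t."""
    dy = (y(t + h) - y(t - h)) / (2 * h)
    return dy - t * math.exp(t + 2 * y(t))

# The logarithm needs a positive argument, which holds on the sample interval [0.2, 1].
sample_points = [0.2, 0.4, 0.6, 0.8, 1.0]
worst = max(abs(residual(t)) / (1 + abs(t * math.exp(t + 2 * y(t)))) for t in sample_points)
print(y(1.0), worst)
```

The residual is scaled by the size of the right-hand side because the solution steepens quickly near $t = 1$, where the argument of the logarithm is small.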


Example 2.5.3 Is the following differential equation linear or nonlinear?

$$ty' + y^2 = 4$$

Classify the equation, and solve it to find a general family of solutions.

Solution. We note that the given equation is nonlinear due to the presence of
$y^2$ in the equation; said differently, the left-hand side is not a linear combination
of $y$ and $y'$. To separate the variables, we first write

$$ty' = 4 - y^2$$

Dividing both sides by $t(4 - y^2)$, it follows that

$$\frac{1}{4 - y^2}\frac{dy}{dt} = \frac{1}{t}$$

Integrating both sides with respect to $t$,

$$\int \frac{dy}{4 - y^2} = \int \frac{dt}{t}$$

Evaluating both integrals, noting that the left-hand side requires integration by
partial fractions or a table of integrals, we have

$$\frac{1}{4}\ln\left(\frac{y + 2}{y - 2}\right) = \ln t + C$$

It only remains to solve for $y$ algebraically. Using rules of logarithms and letting
$C = \ln K$, we can write

$$\ln\left(\frac{y + 2}{y - 2}\right)^{1/4} = \ln(Kt)$$

and therefore

$$\left(\frac{y + 2}{y - 2}\right)^{1/4} = Kt$$

Raising both sides to the fourth power, multiplying by $(y - 2)$, and solving for
$y$ yields

$$y = \frac{2(K^4t^4 + 1)}{K^4t^4 - 1}$$


2.5.2 Exact equations 


We will consider one other type of nonlinear differential equation that may be 
solved analytically. We explore this through an example. Let us solve the DE 


$$(2 + t^2y)y' + ty^2 = 0$$

We first observe that this equation is neither linear nor separable. The
former is clear from the presence of $y^2$ and $yy'$; the latter is less obvious, but
nonetheless true since the presence of the term $(2 + t^2y)$ makes it impossible to
separate the variables $t$ and $y$. We therefore explore another algebraic approach.
Considering the derivative in differential notation, we have

$$(2 + t^2y)\frac{dy}{dt} + ty^2 = 0$$

and thus we may instead write

$$(ty^2)\,dt + (2 + t^2y)\,dy = 0 \tag{2.5.6}$$


This form may remind us of the total differential $d\phi$ of a function $\phi(t, y)$, as
studied in multivariable calculus. Recall that for a differentiable function $\phi(t, y)$,
its total differential $d\phi$ is given by

$$d\phi = \phi_t\,dt + \phi_y\,dy$$

where $\phi_t = \partial\phi/\partial t$ and $\phi_y = \partial\phi/\partial y$. Note, therefore, from (2.5.6) that if there
exists a function $\phi$ such that $\phi_t = ty^2$ and $\phi_y = 2 + t^2y$, then (2.5.6) is actually
of the form $d\phi = 0$, from which it follows that $\phi(t, y) = K$, for some constant
$K$. Assuming that we can find the function $\phi(t, y)$, we have then transformed
the original differential equation in $t$ and $y$ to an algebraic equation in $t$ and $y$,
one that we can hopefully solve for $y$.

In the current example, let us suppose that such a function $\phi(t, y)$ exists,
and therefore that

$$\frac{\partial\phi}{\partial t} = ty^2 \tag{2.5.7}$$

and

$$\frac{\partial\phi}{\partial y} = 2 + t^2y \tag{2.5.8}$$


Integrating both sides of (2.5.7) with respect to $t$, it follows that

$$\phi(t, y) = \frac{1}{2}t^2y^2 + g(y)$$

The function $g(y)$ arises since the partial derivative with respect to $t$ of any
function of only $y$ is zero. For $\phi$ to satisfy the condition in (2.5.8), we see that
we must take the partial derivative with respect to $y$ of our most recent result
and set this equal to $2 + t^2y$. Doing so, we find that

$$\frac{\partial\phi}{\partial y} = t^2y + g'(y) = 2 + t^2y$$

Therefore, $g'(y) = 2$, so $g(y) = 2y$, and we have found that

$$\phi(t, y) = \frac{1}{2}t^2y^2 + 2y$$

Since it is the case that $d\phi = 0$, we know that $\phi(t, y) = K$, and therefore $t$ and
$y$ are related by the algebraic equation

$$\frac{1}{2}t^2y^2 + 2y = K$$

From the quadratic formula, it follows that

$$y = \frac{-2 \pm \sqrt{4 + 2Kt^2}}{t^2}$$

and we have solved the original equation. The choice of “$+$” or “$-$” in the
solution would depend on the value given in an initial condition.
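
Both claims in this example, that the explicit branch lies on the level curve $\frac{1}{2}t^2y^2 + 2y = K$ and that it satisfies $(2 + t^2y)y' + ty^2 = 0$, can be checked numerically. The sketch below uses hypothetical function names, an arbitrary illustrative value $K = 3$, and a central-difference derivative; it examines only the "$+$" branch.

```python
import math

def y_plus(t, K):
    """The '+' branch y = (-2 + sqrt(4 + 2 K t^2)) / t^2 of the implicit solution."""
    return (-2 + math.sqrt(4 + 2 * K * t**2)) / t**2

def phi(t, y):
    """Potential function phi(t, y) = (1/2) t^2 y^2 + 2y found in the text."""
    return 0.5 * t**2 * y**2 + 2 * y

def de_residual(t, K, h=1e-6):
    """(2 + t^2 y) y' + t y^2 along the '+' branch, with y' from a central difference."""
    y = y_plus(t, K)
    dy = (y_plus(t + h, K) - y_plus(t - h, K)) / (2 * h)
    return (2 + t**2 * y) * dy + t * y**2

K = 3.0  # arbitrary illustrative constant
sample_points = [0.5, 1.0, 1.5, 2.0]
phi_gap = max(abs(phi(t, y_plus(t, K)) - K) for t in sample_points)
de_gap = max(abs(de_residual(t, K)) for t in sample_points)
print(phi_gap, de_gap)
```

The formula divides by $t^2$, so the sample points avoid $t = 0$.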
There are several important lessons to learn from this example. One is some 
terminology. Ifa differential equation can be written in the form 


M(t, y)dt +N(t, y)dy =0 (2.5.9) 


and there exists a function (t, y) such that ;(t, y) = M(t, y) and ¢,(t, y) = 
N(t, y), then since the differential equation is of the form d@(t, y) = 0, we say 
that the equation is exact. 

So, certainly a first check of whether an equation might be exact consists 
in trying to write it in the form of (2.5.9). Still, there is the issue of whether or 
not $\phi$ exists. If $\phi$ does exist, and we further assume that $M(t, y)$ and $N(t, y)$
have continuous first-order partial derivatives, then it follows from Clairaut's
Theorem in multivariable calculus that

$$M_y(t, y) = \phi_{ty} = \phi_{yt} = N_t(t, y)$$

Thus, if (2.5.9) is exact, then it must be the case that $M_y = N_t$. Said differently,
if $M_y \ne N_t$, then the differential equation is not exact. In fact, it turns out that
if $M_y = N_t$ (on a rectangular region where these partial derivatives are
continuous), then the equation is guaranteed to be exact, but this result is much
more difficult to prove. As a consequence, it suffices for us to check whether
$M_y = N_t$ as a first step; if so, the equation is indeed exact and we then proceed
to try to find the function $\phi$ in order to solve the differential equation. If not,
another approach is needed.
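Before working by hand, the test for exactness can be spot-checked numerically. The following is a minimal sketch in Python, not part of the original text: the helper names `partial` and `looks_exact`, the sample points, and the non-exact pair $M = y$, $N = -t$ are all illustrative assumptions. It compares central-difference estimates of $M_y$ and $N_t$, using $M = ty^2$ and $N = 2 + t^2 y$ as read from (2.5.7) and (2.5.8). Agreement at a few points is evidence of exactness, not a proof.

```python
def partial(f, t, y, wrt, eps=1e-6):
    """Central-difference estimate of a first partial derivative of f(t, y)."""
    if wrt == "t":
        return (f(t + eps, y) - f(t - eps, y)) / (2 * eps)
    return (f(t, y + eps) - f(t, y - eps)) / (2 * eps)


def looks_exact(M, N, points, tol=1e-5):
    """Return True if M_y and N_t agree numerically at every sample point."""
    return all(
        abs(partial(M, t, y, "y") - partial(N, t, y, "t")) < tol
        for t, y in points
    )


# Sample points chosen arbitrarily for illustration.
samples = [(0.5, 1.0), (1.0, 2.0), (2.0, -1.5)]

# M and N read off from (2.5.7) and (2.5.8): M = t*y^2, N = 2 + t^2*y.
exact_pair = looks_exact(lambda t, y: t * y**2, lambda t, y: 2 + t**2 * y, samples)

# A hypothetical non-exact pair: M = y, N = -t gives M_y = 1 but N_t = -1.
non_exact_pair = looks_exact(lambda t, y: y, lambda t, y: -t, samples)

print(exact_pair, non_exact_pair)
```

For the first pair both partial derivatives equal $2ty$, so the check passes; for the second pair they differ in sign, so it fails.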
An example is instructive. 


Example 2.5.4 Solve the differential equation

$$\frac{t}{y}\,y' + \ln(ty) + 1 = 0$$


Solution. We begin by observing that this equation is neither linear nor
separable. Thus, writing the derivative in differential notation, we have

$$\frac{t}{y}\frac{dy}{dt} + \ln(ty) + 1 = 0$$

and then rearranging algebraically,

$$(\ln(ty) + 1)\,dt + \frac{t}{y}\,dy = 0 \tag{2.5.10}$$


Letting $M(t, y) = \ln(ty) + 1$ and $N(t, y) = t/y$, we observe that

$$M_y = \frac{1}{ty} \cdot t = \frac{1}{y} \quad \text{and} \quad N_t = \frac{1}{y}$$

and therefore, $M_y = N_t$. Hence the differential equation is exact and we can
assume that a function $\phi$ exists such that $\phi_t = M(t, y)$ and $\phi_y = N(t, y)$.

Since the latter equation is more elementary, we consider $\phi_y = t/y$, and
integrate both sides with respect to $y$. Doing so, we find that

$$\phi(t, y) = t \ln y + h(t) \tag{2.5.11}$$


From (2.5.10), $\phi$ must also satisfy $\phi_t = \ln(ty) + 1$, so we take the partial derivative
of both sides of (2.5.11) with respect to $t$ to find that

$$\phi_t = \ln y + h'(t) = \ln(ty) + 1$$

From this and properties of the logarithm, we observe that

$$\ln y + h'(t) = \ln t + \ln y + 1$$

and thus $h'(t) = \ln t + 1$. It follows (integrating by parts and simplifying) that
$h(t) = t \ln t$. Thus, we have demonstrated that the original equation is indeed
exact by finding $\phi(t, y) = t \ln y + t \ln t = t \ln(ty)$. From here, we now know that
$\phi(t, y) = K$, and so

$$t \ln(ty) = K$$

Solving for $y$, we have that

$$y = \frac{1}{t}\, e^{K/t}$$
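A solution found this way can be checked by substituting it back into the differential equation. The sketch below, which is our own illustration rather than part of the text, does this numerically for $y = \frac{1}{t} e^{K/t}$: it estimates $y'$ with a central difference and evaluates the left-hand side $(t/y)\,y' + \ln(ty) + 1$, which should be close to zero. The value $K = 2$, the $t$-values, and the function names are arbitrary assumptions, and we take $t > 0$ so that the logarithm is defined.

```python
import math


def y(t, K):
    """Candidate solution y = (1/t) * exp(K/t) from Example 2.5.4 (t > 0)."""
    return math.exp(K / t) / t


def residual(t, K, eps=1e-6):
    """Left-hand side (t/y) y' + ln(t y) + 1, with y' from a central difference."""
    dy_dt = (y(t + eps, K) - y(t - eps, K)) / (2 * eps)
    return (t / y(t, K)) * dy_dt + math.log(t * y(t, K)) + 1


# K = 2 and the t-values are arbitrary illustrative choices.
residuals = [residual(t, 2.0) for t in (0.5, 1.0, 3.0)]
print(residuals)
```

Each printed residual should be zero up to the small error introduced by the finite-difference estimate of $y'$.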


Exercises 2.5 
Classify each of the DEs in exercises 1-14 as linear, nonlinear, separable, or exact. 
Note that it is possible for an equation to satisfy more than one classification. 


1. $y' = 10y$

2. $y' = 10y + 10$
3. $y' = 10y^2$
4. $y' = 10y^2 - 10$
5. Py +y2=1 
dy 
6. et¥ = =] 
7 dt 
7. tdy —(y—1)dt=0 
: dy _5ty-t 
“dt 447 
dy 2 dy 
9. y—t— =6—3t?7— 
YO" d 
d —2 
ico 
dt t#+1 


11. $(2 + t^2)y' + 2ty = 0$
12. $3y^2 y' + t^2 = 0$
13. $(y + t)y' + y = t$

14. $y' \sin 2t + 2y \cos 2t = 0$


Solve each of the DEs in exercises 15-28. 


15. $y' = 10y$
16. $y' = 10y + 10$
17. $y' = 10y^2$


18. $y' = 10y^2 - 10$
19. fy +y?=1


20. etry Y =1 


dt 
21. tdy —(y—1)dt =0 
22, o = ~_ 
23. y+ = 630° 
24. o = — 


25. $(2 + t^2)y' + 2ty = 0$
26. $3y^2 y' + t^2 = 0$

27. $(y + t)y' + y = t$

28. $y' \sin 2t + 2y \cos 2t = 0$


Solve each of the IVPs stated in exercises 29-42. In addition, use a computer 
algebra system to plot an appropriate direction field for each, and sketch your 
solution within the plot. 


29. $y' = 10y$, $y(0) = 3$

30. $y' = 10y + 10$, $y(0) = 2$
31. $y' = 10y^2$, $y(1) = 4$

32. $y' = 10y^2 - 10$, $y(1) = -1$
33. fy +y2=1, y(2)=0 


d 
34, ety 1, (0) =0 


dt 
35. tdy—(y—1)dt=0, y(1)=3 
36. o os y= 
37. yt 630%, y(1)=5 
38. o a y(0)=4 


39. $(2 + t^2)y' + 2ty = 0$, $y(1) = 1$
40. $3y^2 y' + t^2 = 0$, $y(0) = 1$
41. $(y + t)y' + y = t$, $y(0) = 1$
42. $y' \sin 2t + 2y \cos 2t = 0$, $y(\pi/4) = 1/2$


43. Consider the IVP $y' = \sqrt{y}$, $y(0) = 0$. Show that this IVP has more than
one solution. Does this result contradict theorem 2.2.1?


2.6 Euler’s method 


While we have learned to solve certain classes of differential equations 
explicitly—including linear first-order, separable, and exact equations—we 
must also develop the ability to estimate solutions to initial-value problems that 
we cannot solve analytically. Direction fields will play a key role in motivating 
our work, as we see in the following introductory example. 

Consider the initial-value problem

$$\frac{dy}{dt} + y^2 = t, \quad y(0) = 1 \tag{2.6.1}$$

This DE is not linear due to the presence of $y^2$. In addition, since we can
write $y' = t - y^2$, we see that the right-hand side may not be expressed as a
product of two functions that each involve just one of the variables $t$ and $y$.
Thus, the equation is not separable. Finally, writing the equation in the form
$dy + (y^2 - t)\,dt = 0$, it is straightforward to check that this equation is not
exact: here $M = y^2 - t$ and $N = 1$, so $M_y = 2y$ while $N_t = 0$.

While it may seem frustrating to not be able to use any of the solution 
methods we have discussed so far, it is important to realize that many differential 
equations cannot be solved explicitly by analytic techniques. As such, we must 
explore how we can use our understanding of derivatives to estimate certain 
values of the solution to an IVP. 

For the given DE, writing $y' = t - y^2$, we can generate the direction field that
is shown in figure 2.5. For the initial condition $y(0) = 1$, visually estimating how
the solution $y(t)$ will flow through the direction field, we can roughly estimate
that $y(1/2) \approx 0.75$. But if we think about the calculus underpinnings of slope
fields, we can be much more precise in our estimate.

Recall that a direction field for a DE $y' = f(t, y)$ is created by observing that
the slope of the tangent line to the solution curve $y(t)$ at the point $(t_0, y_0)$ is
$f(t_0, y_0)$. In the current example, we know that the solution to the IVP must pass
through the point $(t_0, y_0) = (0, 1)$. At this point, the slope of the tangent line to
the solution curve is $m = 0 - 1^2 = -1$; note also that $m \approx \Delta y/\Delta t$, where $\Delta y$ is
the exact change in $y$ from $t = 0$ to $t = 1/2$, due to the fact that the tangent line
approximates the solution curve for values near the point of tangency. Thus, as
we step from $t = 0$ to $t = 1/2$, a change of $1/2$ in the $t$-direction will generate
an approximate change $\Delta y \approx \Delta t \cdot m = \frac{1}{2} \cdot (-1) = -\frac{1}{2}$ in $y$. Therefore, from
our original $y$-value of 1, a change of $-1/2$ leads us to the approximation that
$y(1/2) \approx 1/2$.
Figure 2.5 The direction field for (2.6.1).

Figure 2.6 Taking one step to estimate y(0.5) in (2.6.1).


Graphically, this estimation approach amounts to following the tangent 
line to the solution curve for some prescribed change in t. We can see this in 
figure 2.6, where it is immediately evident that our estimate is too small. In 
calculus, we learn that while the tangent line approximation to a differentiable 
function is good near the point of tangency, the approximation gets poorer 
and poorer the further we move from the point of tangency. Thus, a natural 
approach to the estimation problem at hand is to take a smaller step, then 
search the direction field for a new direction to follow, and then take another 
small step. In this situation, we are much like a hiker lost in the woods who is 
attempting to navigate by compass: just as the hiker is best served by checking a 
compass frequently, so are we best served by checking slopes frequently. 
Figure 2.7 Two steps of size 0.25 to estimate y(0.5) in (2.6.1).


So, rather than stepping the full distance of $1/2$ from $t = 0$ to $t = 1/2$, let
us first step to $t = 1/4$, find an estimate to $y(1/4)$, and then proceed from there
to estimate $y(1/2)$. Starting at $(0, 1)$, we know that the slope of the tangent line
to the solution curve at this point is $m_0 = f(0, 1) = -1$. Stepping $\Delta t = 0.25$, it
follows that we experience a change in $y$ along the tangent line of $\Delta y = m_0 \Delta t =
-1(0.25) = -0.25$. Thus, we have that $y(0.25) \approx y(0) + \Delta y = 1 - 0.25 = 0.75$.

Now we repeat this process from the point $(0.25, 0.75)$. At this point, the
slope of the tangent line to the solution curve is $m_1 = f(0.25, 0.75) = 0.25 -
(0.75)^2 = -0.3125$. Taking a step of $\Delta t = 0.25$, it follows that the change in
$y$ along the tangent line will be $\Delta y = m_1 \Delta t = -0.3125(0.25) = -0.078125$.
Thus we have that $y(0.5) \approx 0.75 - 0.078125 = 0.671875$. We record our work
graphically in figure 2.7, where our improved approximation is apparent, though
the estimate is still too small.


It is evident from our work in this first example that we can significantly
improve our ability to estimate an initial-value problem's solution at various
$t$-values by developing an iterative process that uses reasonably small step sizes.
In particular, we want to imitate the way in which we took two steps, but rather
be able to take $n$ steps using a step size of $\Delta t = h$. Throughout, the key idea is
always that we are estimating the solution function by determining its tangent
line at a given point, and then following the tangent line for the determined step
size. We observe that when moving along any line of slope $m$ from a given point
$(t_{old}, y_{old})$ to a new point $(t_{new}, y_{new})$, it follows that

$$y_{new} = y_{old} + \Delta y = y_{old} + \frac{\Delta y}{\Delta t}\,\Delta t = y_{old} + m \cdot \Delta t \tag{2.6.2}$$
Another essential observation to make is that the slope $m$ at each step of our
approximation is given by $m = y' = f(t, y)$ in the differential equation that we
are attempting to solve. In particular, if we have some approximation at time $t_n$
given by $y_n$, the slope of the tangent line to the solution curve at this point is
given by $f(t_n, y_n)$. Therefore, using this value for $m$ in (2.6.2) and letting $h = \Delta t$
be the step size, we now have

$$y_{new} = y_{old} + h f(t_{old}, y_{old}) \tag{2.6.3}$$

Hence, starting from the initial condition $(t_0, y_0)$, we are able to generate the
sequence of points $(t_1, y_1), \ldots, (t_n, y_n)$, where for each $n \ge 0$,

$$t_{n+1} = t_n + h \quad \text{and} \quad y_{n+1} = y_n + h f(t_n, y_n) \tag{2.6.4}$$

The value $y_n$ is an approximation of the exact solution value $y(t_n)$ at each step,
so that $y_n \approx y(t_n)$ for each $n \ge 1$. This method of approximating the solution to
an initial-value problem is known as Euler's method.
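The rule (2.6.4) translates almost line for line into a short program. The following Python sketch is our own illustration (the function names `euler` and `slope` are assumptions, not notation from the text); it applies the rule to the IVP (2.6.1) and repeats the one-step and two-step estimates of $y(0.5)$ computed by hand above.

```python
def euler(f, t0, y0, h, n_steps):
    """Return the lists [t_0..t_n] and [y_0..y_n] produced by rule (2.6.4)."""
    ts, ys = [t0], [y0]
    for _ in range(n_steps):
        t, y = ts[-1], ys[-1]
        ys.append(y + h * f(t, y))   # y_{n+1} = y_n + h f(t_n, y_n)
        ts.append(t + h)             # t_{n+1} = t_n + h
    return ts, ys


# The IVP (2.6.1): y' = t - y^2, y(0) = 1.
def slope(t, y):
    return t - y**2


_, one_step = euler(slope, 0.0, 1.0, 0.5, 1)
_, two_steps = euler(slope, 0.0, 1.0, 0.25, 2)
print(one_step[-1], two_steps[-1])
```

The two printed values are 0.5 and 0.671875, matching the hand computations. Passing the slope function `f` as an argument means the same routine can be reused for any first-order IVP of the form $y' = f(t, y)$.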


Example 2.6.1 For the initial-value problem

$$\frac{dy}{dt} + y^2 = t, \quad y(0) = 1$$

that we have just considered, apply Euler's method to estimate the value of
$y(1/2)$ using $h = 0.1$.


Solution. At the end of this section, the implementation of Euler’s method in 
a spreadsheet such as Excel will be discussed. Here, we simply report the results 
of such a computer implementation. If we use a step size of $h = 0.1$, we see that
we will take five steps to move from $t_0 = 0$ to $t_5 = 0.5$, the point at which we
seek to approximate $y$. Doing so yields the output shown in table 2.1.

With just five steps, we can see in the direction field in figure 2.8, together 
with a piecewise linear plot of the approximate solution, that we have an 
apparently good estimate in the above table for how the actual solution to 
this IVP behaves on this interval. 


Table 2.1
Euler's method applied to the IVP $y' = t - y^2$, $y(0) = 1$, using $h = 0.1$

$t_n$    $y_n$
0        1
0.1      0.9
0.2      0.829
0.3      0.7802759
0.4      0.749392852
0.5      0.733233887

Figure 2.8 Five steps of size h = 0.1 to estimate y(0.5).


In the example we have been considering with various step sizes, one
shortcoming is that we do not have a precise sense of how accurate our
approximations are. One way to explore this issue is to apply Euler's method to
an IVP that we can solve exactly, and then compare our estimates with actual
solution values. We do so in the following example.


Example 2.6.2. Solve the IVP y’ = y — t, y(0) = 0.5 exactly, and use Euler’s 
method with the step sizes h = 0.2 and h = 0.1 to estimate the value of y(1). 
Hence analyze the effect that step size has on error in the method. 


Solution. We first observe that $y' = y - t$ is a linear first-order DE. Applying
our work from section 2.3, we can determine that the solution to this equation
is $y = 1 + t + Ce^t$. The initial condition $y(0) = 0.5$ then implies that $C = -1/2$,
so that the solution to the IVP is

$$y = 1 + t - \frac{1}{2} e^t$$

If we apply Euler's method with $h = 0.2$ and take 5 steps to determine $y_n$
at each, and also evaluate $y(t_n)$ at each stage, the resulting output is shown
in table 2.2.

Here, we observe the obvious pattern that the further we step away from 
the initial condition, the greater the error we encounter. This is a natural 
consequence of the use of linear approximations. 

To get a further sense of how the error at a given step depends on step size, 
we now apply the same method with h = 0.1. Doing so produces the results in 
table 2.3. For ease of display and comparison to the case where h = 0.2, we only 
report the results from every other step. 

By comparing the approximations in tables 2.2 and 2.3 at the
common values of $t = 0.2, 0.4, 0.6, 0.8, 1$ we can see that cutting the step size in
half appears to have reduced the error by a factor of approximately 2 (for
instance, at $t = 1$ the error drops from 0.1149809 to 0.0622697).
Table 2.2
Euler's method applied to the IVP $y' = y - t$, $y(0) = 0.5$, using $h = 0.2$

$t_n$    Euler Est. $y_n$    Solution $y(t_n)$    Error $|y(t_n) - y_n|$
0        0.5                 0.5                  0
0.2      0.6                 0.5892986            0.0107014
0.4      0.68                0.6540877            0.0259123
0.6      0.736               0.6889406            0.0470594
0.8      0.7632              0.6872295            0.0759705
1.0      0.75584             0.6408591            0.1149809

Table 2.3
Euler's method applied to the IVP $y' = y - t$, $y(0) = 0.5$, using $h = 0.1$

$t_n$    Euler Est. $y_n$    Solution $y(t_n)$    Error $|y(t_n) - y_n|$
0        0.5                 0.5                  0
0.2      0.595               0.5892986            0.0057014
0.4      0.66795             0.6540877            0.0138623
0.6      0.7142195           0.6889406            0.0252789
0.8      0.728205595         0.6872295            0.0409761
1.0      0.70312877          0.6408591            0.0622697
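The error comparison in tables 2.2 and 2.3 can be reproduced with a few lines of code. The Python sketch below is an illustration of our own: the names `euler_final` and `exact` are assumptions, and the third step size $h = 0.05$ is an extra case that does not appear in the tables. It computes the error at $t = 1$ for each step size using the exact solution $y = 1 + t - \frac{1}{2}e^t$.

```python
import math


def euler_final(f, t0, y0, h, n_steps):
    """Apply Euler's rule n_steps times and return the last estimate."""
    t, y = t0, y0
    for _ in range(n_steps):
        y += h * f(t, y)   # uses the current (old) t before it is advanced
        t += h
    return y


def exact(t):
    """Exact solution of y' = y - t, y(0) = 0.5."""
    return 1 + t - 0.5 * math.exp(t)


errors = {}
for h, n in ((0.2, 5), (0.1, 10), (0.05, 20)):
    estimate = euler_final(lambda t, y: y - t, 0.0, 0.5, h, n)
    errors[h] = abs(exact(1.0) - estimate)

for h, err in errors.items():
    print(f"h = {h}: error at t = 1 is {err:.7f}")
```

The first two errors agree with the last rows of tables 2.2 and 2.3, and the additional run with $h = 0.05$ lets us see whether the error continues to shrink as the step size is halved again.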


In fact, there are sophisticated ways by which we can analyze the error of Euler’s 
method in general; we explore these and related issues in depth in chapter 7 
on numerical methods. And while Euler’s method can give us an intuitive 
sense for how a solution is behaving locally, we must note here that its error 
grows too fast to make it reliable. More sophisticated algorithms for numerically 
estimating solutions to differential equations exist; several of these are developed 
in chapter 7. 


2.6.1 Implementing Euler’s method in Excel 


Any spreadsheet program provides a straightforward way to implement Euler's
method. In our calculations, we will use Microsoft Excel. Recall that in Euler's
method, given an initial-value problem $y' = f(t, y)$, $y(t_0) = y_0$, we seek
approximations $y_1, y_2, \ldots$ such that $y_n \approx y(t_n)$, where $t_n = t_0 + nh$, for some
chosen step size $h$. In particular, we use the rule

$$y_{n+1} = y_n + h f(t_n, y_n)$$
In a given row of the spreadsheet, we will view the data (as labeled in the cells
below) step number $n$, step size $h$, $t$-value $t_n$, approximate current $y$-value $y_n$,
slope $f(t_n, y_n)$, and updated $y$-value $y_{n+1}$.

We will demonstrate the development of such an Excel spreadsheet for the
particular example $y' = t - y^2$, $y(0) = 1$ using a step size of $h = 0.1$. To begin,
we establish names for the various columns, say in cells A1, B1, C1, D1, E1,
and F1, by entering the text “n”, “h”, etc., in the respective cells
shown below.


     A    B    C     D     E            F
1    n    h    t_n   y_n   f(t_n,y_n)   y_n+1


In row 2, we now enter the given data at step zero. In particular, in cell A2 we
enter the step number (“0”), in B2 the chosen step size (“0.1”), in C2 the
starting $t$-value (“0”), in D2 the starting $y$-value (“1”), and in E2, we apply the
function $f(t, y)$ to get the slope at the point at this step. That is, since in this IVP
$f(t, y) = t - y^2$, we enter in E2 the command “=C2 - D2^2”. We now also
have enough information entered to compute $y_1$ in cell F2. Using the rule from
Euler's method, we know $y_1 = y_0 + h f(t_0, y_0)$. In our spreadsheet, this implies
we must enter “=D2 + B2*E2”. Doing so, the result ($y_1 = 0.9$) appears in cell
F2. Now our spreadsheet should appear as shown.


     A    B     C     D     E            F
1    n    h     t_n   y_n   f(t_n,y_n)   y_n+1
2    0    0.1   0     1     -1           0.9


In row 3, we may now build subsequent entries based on existing data. To
increase the step number, in A3 we enter “=A2 + 1”. Since the step size
stays constant throughout, in B3 we input “=B2”. Because the next $t$-value
will be the preceding $t$-value plus the step size ($t_1 = t_0 + h$), we enter in
C3 the command “=C2 + B2”. We also have the next $y$-value, so in D3
we enter “=F2” to have this data available in the given row. The slope at
step 1 is computed according to the same rule (given by $f(t, y)$) as it was at
step 0. Hence in cell E3 we simply paste a copy of cell E2, which ensures
that Excel uses the same computations, but updates them for the current
step. Equivalently, we can directly enter in E3 the text “=C3 - D3^2”.
Cell F3 computes the newest $y$-value: the same rule as in step 0 must be
followed, so we can copy and paste cell F2 into F3, or equivalently enter in
F3 “=D3 + B3*E3”.
At this stage, we see on the screen the following.

     A    B     C     D     E            F
1    n    h     t_n   y_n   f(t_n,y_n)   y_n+1
2    0    0.1   0     1     -1           0.9
3    1    0.1   0.1   0.9   -0.71        0.829


Now we can harness the power of Excel to compute as many subsequent steps as
we like. By using the mouse to highlight row 3 (cells A3 through F3), and then
placing the cursor on the bottom right corner of cell F3, we can then click and
drag downward to fill subsequent rows with similar calculations. For example,
doing so through step 5 (i.e., down to row 7, ending at cell F7) yields the
following table.


     A    B     C     D             E              F
1    n    h     t_n   y_n           f(t_n,y_n)     y_n+1
2    0    0.1   0     1             -1             0.9
3    1    0.1   0.1   0.9           -0.71          0.829
4    2    0.1   0.2   0.829         -0.487241      0.7802759
5    3    0.1   0.3   0.7802759     -0.30883048    0.749392852
6    4    0.1   0.4   0.749392852   -0.161589647   0.733233887
7    5    0.1   0.5   0.733233887   -0.037631934   0.729470694
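For readers without a spreadsheet at hand, the same row-by-row layout can be imitated in a short program. The Python sketch below is an illustration of our own (the function name `spreadsheet_rows` is an assumption): each tuple plays the role of one spreadsheet row, and the comments indicate which cell formula each line corresponds to.

```python
def spreadsheet_rows(f, t0, y0, h, last_step):
    """Build rows (n, h, t_n, y_n, f(t_n, y_n), y_{n+1}) like columns A-F."""
    rows = []
    n, t, y = 0, t0, y0                      # row 2: the given data at step 0
    while n <= last_step:
        slope = f(t, y)                      # column E, e.g. "=C2 - D2^2"
        y_next = y + h * slope               # column F, e.g. "=D2 + B2*E2"
        rows.append((n, h, t, y, slope, y_next))
        n, t, y = n + 1, t + h, y_next       # next row: "=A2 + 1", "=C2 + B2", "=F2"
    return rows


rows = spreadsheet_rows(lambda t, y: t - y**2, 0.0, 1.0, 0.1, 5)
for n, h, t, y, slope, y_next in rows:
    print(f"{n:2d} {h:4.2f} {t:4.1f} {y:12.9f} {slope:13.9f} {y_next:12.9f}")
```

Changing the single argument `0.1` to another step size regenerates every row, which parallels editing cell B2 in the spreadsheet.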


Besides the ease of iteration past the first two rows, there are further advantages 
Excel offers. One is that changing one appropriately-chosen cell will update all 
of our computations. For example, if we are interested in the change induced 
by a different step size, say h = 0.05, all we need to do is enter “0.05” in cell 
B2, and every other cell will update accordingly. In addition, if we desire to see 
the graphical results of our work, we can use Excel’s Chart Wizard. 

To plot our approximations, we can simultaneously highlight the $t$ and
$y$ columns in our chart above (cells C2 through C7 and D2 through D7), and
then go to the Insert menu and select Chart (alternatively, we may click on the Chart
Wizard icon on the toolbar). In the prompt window that arises, we choose “XY
(Scatter)” and select one of the graph style options at the right by clicking on
the desired one. By clicking “Next” in a few subsequent windows (in which
advanced users can avail themselves of more options), we eventually get to a
final window where our graph appears and the option to “Finish.” Clicking on
“Finish,” the graph will appear in the spreadsheet and may be moved around
by clicking and dragging it accordingly. We see the resulting plot displayed as in
figure 2.9.

Figure 2.9 An Excel plot of an approximate solution to the IVP $y' = t - y^2$, $y(0) = 1$,
for $0 \le t \le 0.5$.


Exercises 2.6 


1. Consider the IVP y’ = t/y, y(1) =3 (where we assume that y is always 
positive). 


(a) Program Excel to use Euler’s method to determine an estimate of the 
value of y(3). Do so using a step size of h = 0.2. Show the results in a 
table and create an appropriate plot of the approximate solution. 

(b) Use an established solution method to determine an algebraic formula 
for the unique solution $y(t)$ for the given IVP. Then determine $y(t_n)$
exactly and use Excel to determine the error in your approximation at 
each step n. Finally, compare a plot of y(t) to your plot of the 
approximation above. 

(c) Use a computer algebra system appropriately to plot a direction field 
for the given differential equation. By hand, sketch a solution that 
satisfies the above IVP. Compare your work in (a) and (b) to the 
direction field. 


2. Consider the IVP y’ = (1—t)(1+ y), y(0) =2. 


(a) Program Excel to use Euler’s method to determine an estimate to the 
value of y(1.6). Do so using step sizes of h = 0.2 and h = 0.1. Show the 
results in a table and create an appropriate plot of the approximate 
solution. 
(b) Use an established solution method to determine an algebraic formula
for the unique solution $y(t)$ for the given IVP. Then determine $y(t_n)$
exactly and use Excel to determine the error in your approximation at 
each step n. Finally, compare a plot of y(t) to your plot of the 
approximation above. 

(c) Use a computer algebra system appropriately to plot a direction field 
for the given differential equation. By hand, sketch a solution that 
satisfies the above IVP. Compare your work in (a) and (b) to the 
direction field. 


3. Consider the IVP $y' = (t - y)^2/4$, $y(0) = 1/2$.


(a) Program Excel to use Euler’s method to determine an estimate to the 
value of y(1.5). Do so using step sizes of h = 0.1 and h = 0.05. Show 
the results in a table and create an appropriate plot of the approximate 
solution. 

(b) Explain why you cannot solve the given IVP explicitly. 

(c) Use a computer algebra system appropriately to plot a direction field 
for the given differential equation. By hand, sketch a solution that 
satisfies the above IVP. Compare your work in (a) to the direction field. 


2 
4. Consider the IVP y’ = e' — oe y= 47e>0, 


(a) Program Excel to use Euler’s method to determine an estimate to the 
value of y(2.2). Do so using step sizes of h = 0.1 and h = 0.05. Show 
the results in a table and create an appropriate plot of the approximate 
solution. 

(b) Use an established solution method to determine an algebraic formula 
for the unique solution $y(t)$ for the given IVP. Then determine $y(t_n)$
exactly and use Excel to determine the error in your approximation at 
each step n. Finally, compare a plot of y(t) to your plot of the 
approximation above. 

(c) Use a computer algebra system appropriately to plot a direction field 
for the given differential equation. By hand, sketch a solution that 
satisfies the above IVP. Compare your work in (a) and (b) to the 
direction field. 


In each of exercises 5-10, find an approximate solution to the stated IVP by 
using Euler’s method with h = 0.1 on the interval [0, 1]. In addition, find an 
exact solution and compare the values and plots of the approximate and exact 
solutions. 


5. $y' + 2ty = 0$, $y(0) = -2$
6. $y' = 2y - 1$, $y(0) = 2$
7.y¥-y=0, y(0)=2 
8. (y')?+2y=0, y(0)=2
9. yy? =8, y(0)=1 
10. $(t + 1)yy' = -1 - y^2$, $y(0) = 2$


In each of exercises 11-14, find an approximate solution to the stated IVP by
using Euler’s method with h = 0.1 on the interval [0, 1]. In addition, explain 
why it is not possible to solve the IVP exactly by established methods. 


11. (y’)? —2y?=t, y(0) =2 

12. $y' - \sin y = 2e^t$, $y(0) = 0$
3.7¥+y=h, y(0)=2 

14. $(t + 1)yy' = -1 - y^2 - t^2$, $y(0) = 2$


2.7 Applications of nonlinear first-order 
differential equations 


In this section, we explore two examples of nonlinear differential equations. 
It is important to recall that if an equation is nonlinear, it is possible that we 
may not be able to solve for the solution function explicitly. Regardless, we can 
use direction fields to qualitatively understand the behavior of solution curves; 
furthermore, if we are unable to find an exact solution function, we may employ 
Euler’s method to generate approximate solutions. 


2.7.1 The logistic equation 


We have recently learned that if a population is assumed to grow at a constant 
relative growth rate (or in a way such that the rate of change of the population 
is proportional to the size of the population), then the population function 
satisfies the initial-value problem 


$$P' = kP, \quad P(0) = P_0$$


This leads to the familiar population model $P(t) = P_0 e^{kt}$, which is also studied
in algebra and calculus courses. While this model is a natural one, it is also 
unrealistic: over significant periods of time, the function P will grow to values 
that become unreasonable since the function exhibits unbounded growth. 

Therefore, we now explore a more plausible population model. Let us 
assume we know that a given population P has the tendency over time to 
level off at a value A. The value A is often called the carrying capacity of the 
population; as the name indicates, it is the maximum population sustainable by 
the surrounding environment. It is natural to further assume that if P is close 
to, but less than A, then dP/dt will be small and positive, indicating that the 
population will be growing slowly. Similarly, if P is close to, but greater than A, 
we will want dP/dt to be negative and close to zero, so that the population will 
be decreasing slowly. 
At the same time, we want to maintain the natural inherent exponential
characteristic of growth, so when $P$ is relatively small (in comparison to $A$), we
would like for $dP/dt$ to be approximately $kP$ for some appropriate constant $k$.
The combination of all these criteria led Belgian mathematician Pierre Verhulst
(1804-1849) to propose the differential equation

$$\frac{dP}{dt} = kP\left(1 - \frac{P}{A}\right) \tag{2.7.1}$$


as a more realistic model of population growth, where k and A are positive 
constants. Equation (2.7.1) is known as the logistic differential equation. 

That the logistic equation may be solved in general (to determine an explicit 
solution P involving k and A) will be shown in the exercises. We consider here 
a specific example where k and A are given to provide further insight into the 
behavior of solutions to this equation. 


Example 2.7.1 A population $P(t)$ exhibits logistic growth according to the
model

$$\frac{dP}{dt} = 0.05P\left(1 - \frac{P}{75}\right), \quad P(0) = 10$$

(a) Determine the values of $P$ for which $P$ is an increasing function.

(b) Plot the direction field for the differential equation.

(c) Determine the value(s) of $P$ for which $P$ is increasing most rapidly.

(d) Solve the IVP explicitly for $P$.

Solution. 


(a) To determine where P is increasing, we require that dP/dt > 0. If P < 0, 
note that (1 — P/75) > 0, which makes dP/dt < 0, so we need P > 0 and 
(1 — P/75) > 0 to make dP/dt positive. This occurs on the interval 
0 < P <75, so for these P values, P is an increasing function of t. We note 
further that if P > 75 or P < 0, then dP/dt < 0 and P is a decreasing 
function. Finally, it is evident that both P = 0 and P = 75 are equilibrium 
solutions, which makes sense given the physical interpretation of the 
population model. 


(b) Using familiar commands in Maple, we can plot the direction field for this 
differential equation. Note in advance the behavior we expect from our 
work above: two equilibrium solutions at 0 and 75, plus certain increasing 
and decreasing behavior. Finally, note that our analysis of the equation 
suggests a good range of values to select for $P$ when plotting, say,
$P = -10 \ldots 100$. As always, some experimentation with $t$ may be necessary
to get a useful plot. The plot is shown in figure 2.10.

Figure 2.10 The slope field for $dP/dt = 0.05P(1 - P/75)$.


(c) To decide where P is increasing most rapidly, we seek the maximum value 
of P’. Graphically, we can observe in figure 2.10 that this appears to occur 
approximately halfway between P = 0 and P = 75. This is reasonable in 
light of the physical meaning of the logistic equation, since at this point 
the population has accumulated some substantial numbers to increase its 
growth rate, while not being close enough to the carrying capacity to have 
its growth slowed. 

We can determine this point of greatest increase in $P$ analytically as
well. Note that $P' = 0.05P(1 - P/75) = 0.05P - \frac{1}{1500}P^2$, so that $P'$ is
determined by a quadratic function of $P$. We have already observed that
this quadratic function has zeros at the equilibrium solutions ($P = 0$ and
$P = 75$), and furthermore, we know that every quadratic function achieves
its extremum (a maximum in this case, since the function
$g(P) = 0.05P - \frac{1}{1500}P^2$ is concave down) at the midpoint of its zeros.
Hence, $P'$ is maximized precisely when $P = 75/2$.


(d) Our final task is to solve the given initial-value problem explicitly for P. 

We first solve the differential equation

$$\frac{dP}{dt} = 0.05P(1 - P/75)$$

for $P$. Note that this equation is separable and nonlinear. Separating
variables, we first write

$$\frac{dP}{P(1 - P/75)} = 0.05\,dt \tag{2.7.2}$$

Because the left-hand side is a rational function of $P$, we may use the
method of partial fractions to integrate the left-hand side of (2.7.2).
Observe that

$$\frac{1}{P(1 - P/75)} = \frac{75}{P(75 - P)}$$
Now, letting

$$\frac{75}{P(75 - P)} = \frac{A}{P} + \frac{B}{75 - P}$$

it follows that $A = 1$ and $B = 1$, so that (2.7.2) may now be written as

$$\left(\frac{1}{P} - \frac{1}{P - 75}\right) dP = 0.05\,dt \tag{2.7.3}$$


Integrating both sides of (2.7.3), we find that $P$ must satisfy the equation

$$\ln|P| - \ln|P - 75| = 0.05t + C$$

Using a standard property of logarithms, the left-hand side may be
expressed as $\ln(|P|/|P - 75|)$, and hence using the definition of the natural
logarithm, it follows that

$$\frac{|P|}{|P - 75|} = e^{0.05t + C} = Ke^{0.05t}$$

where $K = e^C$. Since $K$ is an arbitrary constant, the sign of $K$ will absorb
the $\pm$ that arises from the presence of the absolute value signs, and thus we
may write

$$\frac{P}{P - 75} = Ke^{0.05t}$$

Multiplying both sides by $P - 75$ and expanding, we see that

$$P = PKe^{0.05t} - 75Ke^{0.05t}$$

and gathering all terms involving $P$ on the left,

$$P\left(1 - Ke^{0.05t}\right) = -75Ke^{0.05t}$$

Thus, it follows that

$$P = \frac{-75Ke^{0.05t}}{1 - Ke^{0.05t}}$$

Multiplying the top and bottom of the right-hand side by $-1/(Ke^{0.05t})$, it
follows that

$$P = \frac{75}{1 - Me^{-0.05t}}$$

where $M = 1/K$. In this final form, it is evident that as $t \to \infty$, $P(t) \to 75$,
which fits with the given carrying capacity in the original problem. At this
point, we can use the initial condition $P(0) = 10$ to solve for $M$; doing so
results in the equation $10 = 75/(1 - M)$, which yields that $M = -13/2$,
and thus

$$P = \frac{75}{1 + \frac{13}{2} e^{-0.05t}}$$

A plot of this function (shown in figure 2.11), along with comparison to
our work throughout this example, supports the conclusion that our solution is
correct.
Figure 2.11 The solution $P = 75/(1 + \frac{13}{2} e^{-0.05t})$ to the IVP $dP/dt =
0.05P(1 - P/75)$, $P(0) = 10$.
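Besides plotting, we can test the formula for $P$ directly against the initial-value problem. The Python sketch below is our own illustration (the function names, the sample times, and the finite-difference step are assumptions): it evaluates $P(0)$, a value of $P$ for large $t$, and the difference between a numerical estimate of $P'(t)$ and $0.05P(1 - P/75)$.

```python
import math


def logistic(t, k=0.05, A=75.0, P0=10.0):
    """Closed-form logistic solution P(t) = A / (1 + M e^{-kt}) with P(0) = P0."""
    M = A / P0 - 1          # from P0 = A / (1 + M); here M = 13/2
    return A / (1 + M * math.exp(-k * t))


def residual(t, k=0.05, A=75.0, eps=1e-5):
    """Difference between a central-difference P'(t) and k P (1 - P/A)."""
    dP_dt = (logistic(t + eps) - logistic(t - eps)) / (2 * eps)
    P = logistic(t)
    return dP_dt - k * P * (1 - P / A)


print(logistic(0.0))                       # initial population
print(logistic(200.0))                     # long-run value, close to 75
print(max(abs(residual(t)) for t in (0.0, 20.0, 50.0, 100.0)))
```

The first value should reproduce the initial condition, the second should lie just below the carrying capacity 75, and the third should be near zero, which is what we expect if the formula satisfies the differential equation.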


For the general logistic differential equation

$$\frac{dP}{dt} = kP\left(1 - \frac{P}{A}\right)$$

an argument similar to the one we just completed can be used to show that the
solution to this equation is

$$P(t) = \frac{A}{1 + Me^{-kt}},$$

where $M$ is a constant that may be determined by an initial condition. This fact
will be shown in exercise 1 for this section.



2.7.2 Torricelli’s law 


Suppose that a water tank has a hole in its base with area a, through which water 
is flowing. Let h(t) be the depth of the water and V(t) be the volume of water 
in the tank at time t. At what rates are h(t) and V(t) changing? 

Evangelista Torricelli (1608-1647) discovered what has come to be known
as Torricelli's law, which describes the way water in an open tank will flow
through a small hole in the bottom. To develop this law, let us consider*
how water molecules will rearrange themselves as water exits the tank and the
relationship between the potential and kinetic energy of a small mass $m$ of water.
The potential energy lost as a small mass $m$ of water falls from a height $h > 0$ is
$mgh$, where $g$ is the acceleration due to gravity; at the same time, the kinetic energy
gained as an equal mass $m$ exits the tank is $\frac{1}{2}mv^2$, where $v$ is the velocity at
which the water is flowing. Equating the potential and kinetic energy, we find
that $mgh = \frac{1}{2}mv^2$, so that

$$v = \sqrt{2gh}$$

This model assumes that no friction is present; a slightly more realistic model
takes a fraction of this velocity, depending on the viscosity. For simplicity, we
will consider the ideal case where friction is not considered.

* Our approach follows that of R. D. Driver in “Torricelli's law: An Ideal Example of an Elementary
ODE,” Amer. Math. Monthly, 105(5) (May 1998), pp. 453-455.

If we now consider the water exiting the tank, it follows that the rate of 
change dV /dt of volume in the tank is determined by the product of the area a 
of the hole and the exiting water’s velocity v. In other words, 


\[ \frac{dV}{dt} = -a\sqrt{2gh}. \tag{2.7.4} \]


At this point, observe that we have related the rate of change of volume to the 
height of the water in the tank at time t. Instead, we desire to relate either
dV/dt and V or dh/dt and h. Of course, height and volume are related. If we
assume that A(y) denotes the tank’s cross-sectional area at height y, then integral
calculus tells us that the volume of the tank up to height h is given by

\[ V(h) = \int_0^h A(y)\,dy. \]

Furthermore, by the Fundamental Theorem of Calculus, differentiating V(h) 
implies dV /dh = A(h), and thus by the chain rule, 


\[ \frac{dV}{dt} = \frac{dV}{dh}\,\frac{dh}{dt} = A(h)\frac{dh}{dt}. \]

Using this new expression for dV /dt in (2.7.4), it follows that 


\[ A(h)\frac{dh}{dt} = -a\sqrt{2gh}, \tag{2.7.5} \]


which is a differential equation in h. In particular, this nonlinear equation 
predicts, given a tank of a particular shape (as determined by A(h)) with a 
hole of area a, the behavior of the function h(t) that describes the height of the 
water at time t. We explore this further in the following example. 


Example 2.7.2. For a cylindrical tank of height 2m and radius 0.3 m, filled 
to the top with water, how long does it take the tank to drain once a hole of 
diameter 4 cm is opened? 


Solution. In this situation, the cross-sectional area A(h) of the tank at height
h is constant because each cross section is a circle of radius 0.3, so that $A(h) = 0.09\pi$. In
addition, the area of the hole in square meters is $a = \pi(0.02)^2 = 0.0004\pi$, and
the acceleration due to gravity is $g = 9.8$ m/s$^2$. Since we have already established that
$A(h)\,dh/dt = -a\sqrt{2gh}$, we therefore conclude that h satisfies the equation

\[ 0.09\pi\,\frac{dh}{dt} = -0.0004\pi\sqrt{19.6h}. \]

Figure 2.12 The slope field for $dh/dt = -0.019676\sqrt{h}$.


Simplifying, it follows that

\[ \frac{dh}{dt} = -0.019676\sqrt{h}. \]

Separating variables, we have

\[ h^{-1/2}\,dh = -0.019676\,dt, \]

and upon integrating, it follows that

\[ 2h^{1/2} = -0.019676t + C. \]

Thus,

\[ h(t) = (C_1 - 0.009838t)^2. \]

Because h(0) = 2, $C_1 = \sqrt{2}$. Furthermore, with $h(t) = (\sqrt{2} - 0.009838t)^2$, we
can see that h(t) = 0 when t = 143.75 sec, at which time the tank is empty. A
plot of h(t) confirms precisely the behavior observed in the direction field in 
figure 2.12. 
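
The arithmetic in this example can be reproduced with a short Python sketch. It assumes the ideal (frictionless) form of Torricelli's law and a cylindrical tank, so that A(h) is constant; the function names `drain_time` and `height` are illustrative and do not come from the text.

```python
import math

def drain_time(tank_radius, hole_radius, depth, g=9.8):
    """Seconds for a cylindrical tank to empty under ideal Torricelli flow."""
    tank_area = math.pi * tank_radius ** 2   # constant cross section A(h)
    hole_area = math.pi * hole_radius ** 2   # area a of the hole
    rate = (hole_area / tank_area) * math.sqrt(2.0 * g)  # dh/dt = -rate * sqrt(h)
    # Separating variables gives sqrt(h) = sqrt(depth) - rate * t / 2.
    return 2.0 * math.sqrt(depth) / rate

def height(t, tank_radius, hole_radius, depth, g=9.8):
    """Water height h(t); zero once the tank has emptied."""
    rate = (hole_radius / tank_radius) ** 2 * math.sqrt(2.0 * g)
    root = math.sqrt(depth) - 0.5 * rate * t
    return root ** 2 if root > 0.0 else 0.0

# Tank of example 2.7.2: radius 0.3 m, depth 2 m, hole of diameter 4 cm.
t_empty = drain_time(0.3, 0.02, 2.0)
print("time to drain (s):", round(t_empty, 2))
print("height after 60 s (m):", round(height(60.0, 0.3, 0.02, 2.0), 4))
```

The computed drain time agrees with the value found above up to the rounding of the constant 0.009838.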


Exercises 2.7 


1. For a population P(t) that exhibits logistic growth according to the
general model

\[ \frac{dP}{dt} = kP\left(1 - \frac{P}{A}\right), \quad P(0) = P_0, \]


(a) Determine the values of P (in terms of A and k) for which P is an 
increasing function. 

(b) Sketch by hand the direction field for the differential equation, clearly 
indicating the role of the constant A in your sketch. 

(c) Determine the value(s) of P (in terms of A and k) for which P is 
increasing most rapidly, and justify your answer. 


(d) Solve the initial-value problem explicitly for P to show that

\[ P(t) = \frac{A}{1 + Me^{-kt}} \]

and determine M in terms of A and $P_0$.


2. The growth of an animal population is governed by the equation


where P(t) is the number of animals in the herd at time t. The initial 
population is known to be 125. Determine the solution P(t), sketch its 
graph, and decide whether there will ever be more than 125 or fewer than 
50 animals present. 


3. Consider the differential equation $dP/dt = -0.02P^2 + 0.08P$.


(a) What are the equilibrium solutions to this equation? 


P(t) =3. 


4. Consider a fish population that grows according to the model

\[ \frac{dP}{dt} = 0.05P - 0.000005P^2 \]


where t is measured in years, and P is measured in thousands. 


(a) Determine the population of fish at time t if initially P(0) = 1000. 
What is the carrying capacity of the population? 

(b) Suppose that the fish population is established as growing according to 
the above model in the absence of fish being removed from the lake. 
Suppose that harvesting begins at a rate of 20 000 fish per year. How 
does the differential equation governing the fish population change? 
Explain. 

(c) Plot a direction field for the updated differential equation you found in 
part (b). Discuss the new equilibrium solutions for the fish population. 
Can you solve the IVP with P(0) = 1000? 

(d) How would the DE change if wildlife biologists began planting 30 000 
fish per year in the lake, and no harvesting occurred? 


5. Solve the initial-value problem

\[ \frac{dP}{dt} = 6 - 7P + P^2, \quad P(0) = 2 \]

Sketch your solution curve P(t) and explain why it makes sense in light of
the equilibrium solutions to the given equation and your understanding of
where dP/dt is positive and negative.


6. A cruise ship leaves port with 2500 vacationers aboard. At the time the
boat leaves the dock, ten recent visitors of an amusement park are sick 
with the flu. Let S(t) denote the number of people at time t who have had 
the flu at some time since leaving port. 


(a) Assuming that the rate at which the flu virus spreads is directly 
proportional to the product of the number of people who have had the 
flu times the number of people not yet infected, write a differential 
equation whose solution is the function S(t). Explain why the 
differential equation is a logistic equation. 

(b) Solve the differential equation you found in (a). Assume that four days 
into the trip, 150 people have been sick with the flu. Clearly show how 
all constants are identified, and sketch a graph of your solution curve. 

(c) How many people have been sick seven days into the trip? How long 
would the boat have to stay at sea for half the vacationers to get ill? 


7. A cylindrical tank of height 4 m and radius 1 m is full of water. A small hole
of diameter 1 cm is opened in the bottom of the tank. Use Torricelli’s law
to determine how long it will take for all the water to drain from the tank.


8. A cylindrical tank of height 1.2 m and radius 30 cm is originally full of
water. A small hole is opened in the bottom of the tank, and after 15 min,
the water in the tank has dropped 10 cm. According to Torricelli’s law,
how large is the hole and how long will it take the tank to drain?


9. Consider a tank that is generated by taking the curve $x = \sqrt{y}$ and
revolving it about the y-axis. Assume that the tank is full of water to a
depth of 1.2 m and that a hole of diameter 1 cm is opened in the bottom.
Use Torricelli’s law to determine how long it will take for all the water to
drain from the tank.


10. Suppose a hemispherical bowl has top radius of 30 cm and at time t = 0 is
full of water. At that moment a circular hole with diameter 1.2 mm is 
opened in the bottom of the tank. Use Torricelli’s law to determine how 
long it will take for all the water to drain from the tank. 


11. For an open cylindrical tank, Torricelli’s law tells us that if a small hole is
opened, the height of the water at time t obeys the IVP

\[ \frac{dh}{dt} = -k\sqrt{h}, \quad h(t_0) = h_0 \]

where k is a constant that depends on the radius of the tank and the radius
of the hole. In this exercise, we will take k = 1.


(a) Explain why theorem 2.2.1 does not guarantee a unique solution to the
IVP

\[ \frac{dh}{dt} = -\sqrt{h}, \quad h(1) = 0 \]

(b) Explain why it is physically impossible to determine the height of the
water at time t < 1 in a tank which satisfies h(1) = 0.

(c) Show that for any c < 1, the function

\[ h(t) = \begin{cases} \left(\frac{1}{2}c - \frac{1}{2}t\right)^2 & \text{if } t < c \\ 0 & \text{if } t \ge c \end{cases} \]

is a solution to the IVP in (a).
(d) Explain how the result of (c) can be interpreted physically in light of 


the time when the tank becomes empty. Compare your findings to 
those in (a) and (b). 


2.8 For further study 


2.8.1 Converting certain second-order DEs to first-order DEs 


Linear second-order differential equations such as 


\[ y'' + p(t)y' + q(t)y = f(t) \tag{2.8.1} \]


will be the focus of upcoming work in chapters 3 and 4. But there are some 
second-order equations we can solve at present. For example, if q(t) = 0 
in (2.8.1), then we can perform a process called reduction of order to convert the 
equation to a first-order one. 


(a) Consider the second-order equation y” + p(t)y’ = f(t). Using the 
substitution u = y’, convert the equation to a new first-order DE involving 
the function u. 


(b) Use a standard solution technique to state the solution u to the differential 


equation in (a) in terms of p(t) and f(t). (Your answer will involve 
integrals.) 


(c) Explain how you would use your result in (b) to find the solution y to the 
original DE. 


(d) Use reduction of order to solve each of the following second-order IVPs.
(i) $y'' + 2y' = 4$, $y(0) = 2$, $y'(0) = 1$
(ii) $y'' + \tan(t)\,y' = t$, $y(0) = 1$, $y'(0) = 0$
(iii) y+ oy =?, y(0)=0, y(0)=1 
(iv) y+ by =4-b6 yO)=1, y(0)=1 
(e) Reduction of order can be performed on certain nonlinear differential 
equations as well. For instance, suppose that we have an equation of form 
y" =g(y')h(t) (2.8.2) 


Show that the substitution u = y’ converts (2.8.2) to a first-order equation
in u. Explain how you would approach solving the new equation in u.

(f) Solve each of the following second-order IVPs.
Qy Sar: gO=1,. 70) S0 


/)2 
Gi) $= o)=2, y= 1 


(iii) y” = e+”, y(0)=0, y'(0)=0 
(iv) y= /y, y(0)=3,  y'(0)=5 


2.8.2 How raindrops fall 


The following questions and discussion are based on the article “Falling
Raindrops” by Walter J. Meyer⁷.

When a raindrop falls, various forces act upon it. We explore several different
models that show the importance of adjusting assumptions appropriately to
match physical conditions. Let us first assume that the only force acting upon
the raindrop is gravity. Under this assumption, Galileo
(1564-1642) hypothesized that the falling raindrop would gain an extra 32 ft/s
in velocity for every second for which it falls. In other words, the acceleration of
the raindrop is constant and equal to 32 ft/s$^2$.


(a) Let y(t) denote the distance (in feet) traveled by the raindrop after it has
been falling for t seconds. Write an initial-value problem involving y(t) 
based on the above assumption. Solve this IVP; be sure to introduce 
appropriate initial conditions based on the context of the problem. 


(b) Assuming that the raindrop starts from rest at an elevation of 3000 ft, how 
long does it take the raindrop to fall to earth? What is the raindrop’s 
velocity when it hits the ground? Why is this model unrealistic? 


(c) We next must attempt to account for the air resistance the raindrop 
encounters through a slightly more sophisticated model. For a raindrop
having diameter d < 0.00025 ft, this model, sometimes known as Stokes’
law, states that the acceleration of the raindrop due to gravity is opposed
by an acceleration directly proportional to the velocity of the raindrop at
that instant. Suppose that the constant of proportionality is given by $c/d^2$,
where $c \approx 3.29 \times 10^{-6}$ ft$^2$/s is an experimentally determined constant.
Write a new IVP (again involving y(t) and its relevant derivatives) for the 
raindrop having diameter d. Do not yet attempt to solve this equation. 
Leave d as an unknown constant. 


(d) Letting v = y’ and using the fact that the raindrop starts from rest, convert 
the IVP in (c) to a first-order IVP involving v. Using d = 0.00012 ft (which 
can be considered a drizzle), produce a slope field corresponding to the 


7 See Applications of Calculus, MAA Notes Number 29, pp. 101-111.

differential equation in v. On this slope field, sketch a graphical
approximation of the solution to the stated IVP. Describe the behavior of 
the raindrop’s velocity based on the slope field you constructed in the 
problem above. 


(e) In the model in (d), we will say that the long-term limiting velocity of the
raindrop is its terminal velocity, denoted $v_{\text{term}}$. Calculate this terminal
velocity by using the IVP to answer the following questions: What is the 
initial velocity of the raindrop? What is the equilibrium solution of the 
differential equation? What happens to the velocity of the raindrop if it 
ever reaches the equilibrium value? Why, in view of the differential 
equation, must the velocity of the raindrop increase from its initial value 
to the equilibrium value? 


(f) Use your result from (e) to determine the terminal velocities for raindrops 
having diameters of 0.00009, 0.00012, and 0.00015 ft, respectively. Graph 
$v_{\text{term}}$ as a function of d, and comment on the phenomena observed.


(g) Solve the IVP from (d) explicitly for v. Graph your solution, and then use
your solution to calculate $v_{\text{term}}$ as well.


(h) Assuming that a raindrop of diameter 0.00012 ft starts from rest at 3000 ft, 
how long does it take the raindrop to fall to the ground? What is its 
velocity at the instant it hits the ground? Do your answers surprise you? Is 
it raining hard or barely raining when raindrops are this size? 


(i) When the diameter of the raindrop becomes too large, the force of air
resistance on the raindrop becomes so appreciable that Stokes’ model loses
accuracy as well. This leads to a third model, known as the velocity-squared
model. This model states that when a raindrop has diameter d > 0.004 ft,
the acceleration due to gravity is opposed by an acceleration directly
proportional to the square of the velocity of the raindrop at that instant.
Here the constant of proportionality is given by k/d, where $k \approx 0.00046$.


(j) Repeat questions (c), (d), and (e) for the velocity-squared model.
Compare your findings with those of Stokes’ model. For example, how do
the terminal velocities of small raindrops compare with those of large 
raindrops? For which type of raindrop, small or large, does the terminal 
velocity increase more rapidly as a function of diameter? 


(k) Finally, explicitly solve the IVP arising from the velocity-squared model 
for the velocity function v(t). Graph your solution v(t) for an appropriate 
choice of d and compare the result to the results in (j). 


2.8.3 Riccati’s equation 


The Riccati equation

\[ y' + p(t)y + q(t)y^2 = f(t) \tag{2.8.3} \]

and its study are attributed to the Italian mathematician Jacopo Riccati
(1676-1754). Observe that this nonlinear equation is a modification of the
standard linear first-order equation y’ + p(t)y = f(t). Through the following
steps, we will use a change of variables to transform the Riccati equation into a
linear, second-order differential equation.


(a) We consider a change of variables to convert (2.8.3) from being a
differential equation in y to a new equation in v. Let v be a function that
satisfies the relationship

\[ v' = q(t)y(t)v(t) \]

(i) Differentiate $v' = qyv$ with respect to t to show that

\[ v'' = (qyv)' = q'yv + qy'v + qyv' \tag{2.8.4} \]

(ii) Show that $q'yv = q'v'/q$.


(b) Multiply both sides of the Riccati equation (2.8.3) by qv and use (i) and 
(ii) to show that the left-hand side may be written 


\[ vqy' + vqpy + vq^2y^2 = v'' + \left(p - \frac{q'}{q}\right)v' \tag{2.8.5} \]


(c) Use your work in (b) to show that the Riccati equation may now be 
re-expressed as the second-order equation in v given by 


\[ v'' + \left(p - \frac{q'}{q}\right)v' - qvf = 0 \tag{2.8.6} \]


(d) Explain how you would solve the Riccati equation in the special case when 
f(t) = 0. Note particularly that to solve (2.8.6) with f(t) = 0, you must 
reduce the order of the equation through an appropriate substitution, say 
u = v’. See section 2.8.1 for further details on this technique. In addition,
note that your goal is to find the solution y to the original equation (2.8.3). 
Be sure to explain how the functions v and u are used in this process. 


(e) Solve the following differential equations, each of which is a Riccati 
equation. 


(i) y’ +2y +4y? =0 
(ii) y +iyt+ty? =0 
(iii) $y' + y\tan t + y^2\cos t = 0$


2.8.4 Bernoulli's equation 


The Bernoulli brothers, James (1654-1705) and John (1667-1748), contributed 
to the solution of 


\[ y' + p(t)y = q(t)y^n, \quad n \ne 1, \tag{2.8.7} \]

the so-called Bernoulli equation. We will explore the approach credited to John
through the following prompts. Similar to the Riccati equation, the Bernoulli
equation may be transformed into a linear differential equation through a clever
change of variables.


(a) First, multiply (2.8.7) by $y^{-n}$ to obtain

\[ y^{-n}y' + p(t)y^{1-n} = q(t) \tag{2.8.8} \]

Next, consider the change of variables $v = y^{1-n}$. Compute v’ to show that

\[ v' = (1-n)y^{-n}y' \tag{2.8.9} \]

Now use (2.8.8) and (2.8.9) to show that v satisfies the linear first-order
equation

\[ v' + (1-n)p(t)v = (1-n)q(t) \tag{2.8.10} \]

(b) Explain why in the cases when n = 1, n = 2, q(t) = 0, and p(t) = 0 the


Bernoulli equation reduces to familiar equations whose solutions are 
known. 


(c) Solve these differential equations, each of which is a Bernoulli equation. 
(i) y +2y= ty? 
(ii) y’ + ty = 373 
(iii) y’ + ycott=y> sint


Linear systems of differential equations


3.1 Motivating problems 


In section 1.1, we considered how the amount of salt present in a system of 
two tanks can be modeled through a system of differential equations. In that 
particular example, we assumed that the volume of solution in each tank (as seen 
in figure 3.1) remains constant and all inflows and outflows happen at the
identical rate of 5 liters/min, and further that the tanks are uniformly mixed
so that the salt concentration is the same throughout each tank at a given
time t.

With the additional premises that the volume of solution in tank A is
200 liters and the independent inflow entering A carries water contaminated
with 4 g/liter of salt, we can develop a differential equation that models $x_1(t)$,
the amount of salt (in grams) in tank A at time t. Likewise, by presuming that
tank B holds solution of volume 400 liters and the inflow entering B carries
a concentration of salt of 7 g/liter, a similar analysis produces a differential
equation whose solution is $x_2(t)$, the amount of salt (in grams) in tank B at
time t. In particular, we found in (1.1.6) that the following system of differential
equations arose:

\[ \begin{aligned} \frac{dx_1}{dt} &= -\frac{x_1}{20} + \frac{x_2}{80} + 20 \\ \frac{dx_2}{dt} &= \frac{x_1}{40} - \frac{x_2}{40} + 35 \end{aligned} \tag{3.1.1} \]


With our experience in linear algebra, we can now represent this system in 
matrix notation. In particular, if we simultaneously consider the amounts of 




Figure 3.1 Two tanks with inflows, outflows, 
and connecting pipes. 


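salt $x_1(t)$ and $x_2(t)$ as entries in the vector function

\[ \mathbf{x}(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} \]

we know that

\[ \mathbf{x}'(t) = \begin{bmatrix} dx_1/dt \\ dx_2/dt \end{bmatrix} \tag{3.1.2} \]

Moreover, in (3.1.1) we recognize the familiar form of a matrix product in the
terms involving $x_1$ and $x_2$. Specifically,

\[ \begin{bmatrix} -x_1/20 + x_2/80 \\ x_1/40 - x_2/40 \end{bmatrix} = \begin{bmatrix} -1/20 & 1/80 \\ 1/40 & -1/40 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \tag{3.1.3} \]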


With the observations from (3.1.2) and (3.1.3) substituted into (3.1.1) and 
replacing the quantities 20 and 35 with the appropriate vector, we may now 
write the system of differential equations in the form 


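\[ \mathbf{x}' = \begin{bmatrix} -1/20 & 1/80 \\ 1/40 & -1/40 \end{bmatrix}\mathbf{x} + \begin{bmatrix} 20 \\ 35 \end{bmatrix} \tag{3.1.4} \]

Letting A be the matrix of coefficients that multiplies the vector x and b the
vector $[20 \;\; 35]^T$, we can also write the system in (3.1.4) in the simplified form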


x’=Ax+b (3.1.5) 


This form reminds us of the familiar nonhomogeneous linear first-order 
differential equation with constant coefficients, for instance, an equation such as 


\[ y' = 2y + 5 \tag{3.1.6} \]


In this chapter, we will study similarities between (3.1.5) and (3.1.6) with 
the specific goal of learning how to completely solve nonhomogeneous 
linear systems of differential equations with constant coefficients such as the 
system (3.1.4). We will be especially interested in the role that linear algebra 
plays in identifying certain characteristics of the coefficient matrix A that enable 
us to find all solutions to the system. 

Figure 3.2 A spring-mass system shown at two different points in time; y(t) denotes
the displacement of the mass from equilibrium (where displacements below the t-axis
are considered positive).

Before we proceed to an in-depth study of linear systems of differential equations,
at least one more motivating example is appropriate. A spring-mass system
is a physical situation that models vibrations; for example, such a system arises
any time a mass attached to a spring is set in motion. We choose to envision this 
situation vertically, as seen in figure 3.2, though one can also imagine the mass 
resting on a table and moving horizontally. 

We consider some of the physics of basic springs and motion under the 
influence of gravity in order to develop a differential equation that describes the 
spring-mass system. Initially, the mass will stretch the spring from its natural 
length. Hooke’s law states that the force necessary to stretch a spring a distance 
x from its natural length is given by the equation 


F(x) = kx 


where k is the spring constant. Assume that the mass stretches the spring a
distance $L_0$. Then from Hooke’s law, when the system is in equilibrium, we see
that the force $F_s$ exerted by the spring must be

\[ F_s = -kL_0 \]

Here the minus sign indicates that the force is opposing the natural downward
displacement of the spring. Note particularly that we view the downward
direction as positive. We also know that gravity acts on the mass with force
$F_g$ given by

\[ F_g = mg \]

If the system is in static equilibrium, we know that the sum of the two forces
is zero. In other words,

\[ F_s + F_g = 0 \]

and therefore

\[ mg = kL_0 \]


Once the system is set in motion by some initial force or displacement, we 
track the location of the mass at time t with a function y(t). In particular, 
y(t) represents the displacement of the mass from the equilibrium position at 
time t; note that y = 0 is the equilibrium position of the system. We continue to
designate the downward direction as positive, so y(t) > 0 means that the mass
is below the equilibrium position, while y(t) < 0 means the mass is above the 
equilibrium position. We can see the role y(t) plays in figure 3.2 as it tracks 
the displacement of the mass from equilibrium and thus traces out a curve with 
respect to time. 

We can now use Newton’s second law to obtain a differential equation that 
governs the system. The forces that act on the mass are: 


* Gravity, with $F_g = mg$.


* The spring force $F_s$. Note now that at a given time t the displacement of
the spring from its natural length is $L_0 + y(t)$, so that by Hooke’s law we
have $F_s = -k(L_0 + y)$.


* A possible damping force $F_d$. Motion may be damped due to air resistance,
friction, or some sort of external damping system (usually called a
dashpot). We assume that damping forces are directly proportional to the
velocity of the mass. Under this assumption, it follows that $F_d = -cy'$.
Again, the minus sign indicates that this force opposes the motion of the
mass. The positive constant c is called the damping constant.


* Finally, there may be an external driving force present (such as the
periodic force that drives a piston in an engine). We call this a forcing
function F(t); the role of forcing functions will be considered in detail later
on in this chapter.


Newton’s second law demands that the resultant force (that is, the sum of all 
the forces) on the mass must be equal to ma, where a is the body’s acceleration 
(which is also y’’). Summing all the aforementioned forces and equating the 
result with ma = my”, we find 


\[ my'' = F_g + F_s + F_d + F(t) \tag{3.1.7} \]

Using the formulas we developed earlier and substituting in (3.1.7) yields

\[ my'' = mg - k(L_0 + y) - cy' + F(t) \tag{3.1.8} \]

Now recall that $mg - kL_0 = 0$, rearrange (3.1.8), and divide by m. This leads
us to the standard form of the differential equation that governs a spring-mass
system,

\[ y'' + \frac{c}{m}y' + \frac{k}{m}y = \frac{1}{m}F(t) \tag{3.1.9} \]


Note that (3.1.9) is a nonhomogeneous linear second-order differential equation. 

To see how such a second-order linear differential equation is linked 
to a system of linear differential equations, let’s consider the specific example 
where c = 1, m = 1, k = 6, and F(t) = 0, which results in the equation

\[ y'' + y' + 6y = 0 \tag{3.1.10} \]

If we introduce the functions $x_1$ and $x_2$ through the substitutions $y = x_1$ and
$y' = x_2$, then $x_1(t)$ represents the displacement of the mass at time t and $x_2(t)$
is the velocity of the mass at time t.

Observe first that

\[ x_1' = x_2 \tag{3.1.11} \]

Moreover, since $x_2' = y''$, we can rewrite (3.1.10) as $x_2' + x_2 + 6x_1 = 0$.
Equivalently,

\[ x_2' = -6x_1 - x_2 \tag{3.1.12} \]

Thus (3.1.11) and (3.1.12) generate the system of differential equations

\[ \begin{aligned} x_1' &= x_2 \\ x_2' &= -6x_1 - x_2 \end{aligned} \tag{3.1.13} \]

which may also be expressed in matrix form as

\[ \mathbf{x}' = \begin{bmatrix} 0 & 1 \\ -6 & -1 \end{bmatrix}\mathbf{x} \tag{3.1.14} \]

We have therefore shown that the linear second-order differential equation
(3.1.10) that describes a spring-mass system may be converted to the system
of linear first-order equations (3.1.14) through the substitution $x_1 = y$, $x_2 = y'$.

In fact, any linear higher order differential equation may be converted 
through a similar substitution to a system of linear first-order equations. 
Therefore, by learning to understand and solve systems of linear equations, 
we will be able to determine the behavior of higher order linear equations as 
well. It is this fact that motivates us to study systems of linear equations prior to 
the study of higher order single equations. 
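
One practical payoff of this conversion is that a first-order system is the form most numerical methods expect. The following Python sketch applies the classical fourth-order Runge-Kutta method to the system x1' = x2, x2' = -6x1 - x2. Note the assumptions: the initial conditions y(0) = 1, y'(0) = 0, the step count, and all function names are chosen only for illustration and do not appear in the text.

```python
def spring_system(x):
    """Right-hand side of x1' = x2, x2' = -6*x1 - x2."""
    x1, x2 = x
    return (x2, -6.0 * x1 - x2)

def rk4_step(f, x, h):
    """Advance the state x by one Runge-Kutta step of size h."""
    k1 = f(x)
    k2 = f(tuple(xi + 0.5 * h * ki for xi, ki in zip(x, k1)))
    k3 = f(tuple(xi + 0.5 * h * ki for xi, ki in zip(x, k2)))
    k4 = f(tuple(xi + h * ki for xi, ki in zip(x, k3)))
    return tuple(xi + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for xi, a, b, c, d in zip(x, k1, k2, k3, k4))

def integrate(f, x0, t_end, steps):
    """Approximate the state at time t_end, starting from x0 at time 0."""
    h = t_end / steps
    x = tuple(x0)
    for _ in range(steps):
        x = rk4_step(f, x, h)
    return x

# Assumed initial state: displacement 1, velocity 0 (illustrative only).
displacement, velocity = integrate(spring_system, (1.0, 0.0), 2.0, 2000)
print("y(2)  is approximately", displacement)
print("y'(2) is approximately", velocity)
```

The first component of the state approximates the displacement y and the second approximates the velocity y', exactly as in the substitution above.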


3.2 The eigenvalue problem revisited 


As we begin our study of linear systems of first-order differential equations, we 
are ultimately interested in two main questions: the first asks, for a linear system 


x’ = Ax such as 
x= 2nd x 
AD. J. 


how can we explicitly solve the system for x(t)? In addition, what is the long- 
term behavior of the solution x(t) to such a system? How does its graph 
appear? We start our investigation by thinking carefully about the meaning 
of the matrix equation x’ = Ax and compare our experience with the single 
first-order differential equation x’ = ax. Note that we naturally begin with the 
homogeneous system x’ = Ax; later we will consider nonhomogeneous systems 
of the form x’ = Ax +b. In every case, we seek a vector function x(t) that solves 
the given system. An elementary example is instructive.

Example 3.2.1 Solve the linear system x’ = Ax, where

\[ A = \begin{bmatrix} -3 & 0 \\ 0 & -1 \end{bmatrix} \tag{3.2.1} \]


Explain the role that the eigenvalues and eigenvectors of A play in the general 
solution, and graph and discuss the solution curves for different choices of initial 
conditions. 


Solution. First, we observe that the system

\[ \begin{bmatrix} x_1' \\ x_2' \end{bmatrix} = \begin{bmatrix} -3 & 0 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \]

tells us that we seek two functions $x_1(t)$ and $x_2(t)$ such that $x_1' = -3x_1$ and
$x_2' = -x_2$. Because the matrix of the system is diagonal, the problem is especially
simple. In particular, the system is uncoupled, which means that the differential
equation for $x_1$ does not involve $x_2$ and the equation for $x_2$ does not involve $x_1$.
From our experience with linear first-order equations, we know that the
general solution to $x_1' = -3x_1$ is $x_1(t) = c_1e^{-3t}$ and that the solution to $x_2' = -x_2$
is $x_2(t) = c_2e^{-t}$. Writing the solution to the system as a single vector, we have

\[ \mathbf{x} = \begin{bmatrix} c_1e^{-3t} \\ c_2e^{-t} \end{bmatrix} \tag{3.2.2} \]


Rewriting x in another form sheds further insight on the key components of this 
solution. Writing x as the sum of two vectors, we find 


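\[ \mathbf{x} = \begin{bmatrix} c_1e^{-3t} \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ c_2e^{-t} \end{bmatrix} = c_1e^{-3t}\begin{bmatrix} 1 \\ 0 \end{bmatrix} + c_2e^{-t}\begin{bmatrix} 0 \\ 1 \end{bmatrix} \tag{3.2.3} \]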


Here, we can make a key observation about the eigenvalues and eigenvectors of 
A: because A is diagonal, its eigenvalues are its diagonal entries, $\lambda_1 = -3$ and
$\lambda_2 = -1$. Moreover, its corresponding eigenvectors may be easily confirmed to
be the vectors

\[ \mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \quad \text{and} \quad \mathbf{v}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \]

Thus, in (3.2.3), we see the interesting fact that the solution has the form
$\mathbf{x} = c_1e^{\lambda_1 t}\mathbf{v}_1 + c_2e^{\lambda_2 t}\mathbf{v}_2$; the eigenvalues and eigenvectors therefore play a central
role in the system’s behavior.

Finally, we explore the solutions to several related initial-value problems
for select initial conditions. If we have the initial condition $\mathbf{x}(0) = [4 \;\; 0]^T$, we
see in (3.2.3) that $c_1 = 4$ and $c_2 = 0$, so that the solution to the IVP is

\[ \mathbf{x}(t) = 4e^{-3t}\begin{bmatrix} 1 \\ 0 \end{bmatrix} \]


Two key observations can be made about this solution curve: one is that its
graph is a straight line, since for every value of t, x is a scalar multiple of the
vector $[1 \;\; 0]^T$. Note particularly that the direction of this line is given by the
eigenvector corresponding to $\lambda_1 = -3$. The other important fact is that $e^{-3t} \to 0$
as $t \to \infty$, and therefore $\mathbf{x}(t) \to \mathbf{0}$, so that the solution approaches the origin as
time increases without bound.

For the initial condition $\mathbf{x}(0) = [0 \;\; 5]^T$, it follows from (3.2.3) that $c_1 = 0$
and $c_2 = 5$, and thus the solution to this IVP is

\[ \mathbf{x}(t) = 5e^{-t}\begin{bmatrix} 0 \\ 1 \end{bmatrix} \]


Similar observations about the behavior of this solution may be made to those 
noted above for the first chosen initial condition: this solution curve is linear 
and approaches the origin as $t \to \infty$.

Finally, if we consider an initial condition that does not correspond to an 
eigenvector of the system, such as $\mathbf{x}(0) = [4 \;\; 5]^T$, (3.2.3) tells us that $c_1 = 4$ and
$c_2 = 5$, and thus

\[ \mathbf{x} = 4e^{-3t}\begin{bmatrix} 1 \\ 0 \end{bmatrix} + 5e^{-t}\begin{bmatrix} 0 \\ 1 \end{bmatrix} \]

This last solution’s graph is not a straight line. As seen in figure 3.3, which shows
the three different solutions based on the differing initial conditions, we see the
consistent behavior that every solution tends to the origin as $t \to \infty$, as well as
that the eigenvectors play a key role in how these graphs appear. We will discuss 
this graphical perspective further in sections 3.4 and 3.5. 
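
The behavior of the three solutions can also be tabulated numerically. The Python sketch below evaluates the general solution (3.2.3) at a few sample times for each of the three initial conditions; the function name `solution` and the particular sample times are choices made for this illustration.

```python
import math

def solution(t, c1, c2):
    """x(t) = c1 e^{-3t} [1, 0]^T + c2 e^{-t} [0, 1]^T, as in (3.2.3)."""
    return (c1 * math.exp(-3.0 * t), c2 * math.exp(-t))

sample_times = (0.0, 0.5, 1.0, 2.0, 4.0)
for c1, c2 in [(4.0, 0.0), (0.0, 5.0), (4.0, 5.0)]:
    points = [solution(t, c1, c2) for t in sample_times]
    print("x(0) =", (c1, c2))
    for t, (x1, x2) in zip(sample_times, points):
        print("   t =", t, " x =", (round(x1, 4), round(x2, 4)))
```

For the first two initial conditions one coordinate stays at zero, so the points lie along an eigenvector direction; for x(0) = [4 5]^T the first coordinate decays faster than the second, which is why that curve bends toward the vertical axis as it approaches the origin.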


Figure 3.3 Plots of solutions to three IVPs for the system in example 3.2.1: the
solutions through (4,0), (0,5), and (4,5). Arrows indicate the direction of flow
along the solution curve as time increases.

The long-term behavior of the solutions to the system (3.2.1) in
example 3.2.1 suggests that every solution tends to the zero vector. In fact, the
origin itself is a solution, a so-called constant or equilibrium solution. That is, if
we consider whether there is any constant vector x that is a solution to x’ = Ax,
it follows that x’ = 0, and thus x must satisfy Ax = 0. From our work with 
homogeneous linear equations, we know that x = 0 is always a solution to this 
equation, and thus the zero vector is a constant solution to every homogeneous 
linear system of first-order differential equations. In sections 3.4 and 3.5 we will 
investigate the so-called stability of this equilibrium solution. 

There is a second perspective from which we can see how eigenvectors 
and eigenvalues arise in the solution of linear systems of differential equations. 
After constant solutions, the next simplest type of solutions to such a system 
are straight-line solutions. In other words, solutions whose graph is a straight 
line in space form a particularly important type of solution to a system. In the 
preceding example, we saw two such straight-line solutions: each occurred in 
the direction of an eigenvector and passed through the origin. 

In search of a general straight-line solution to x’ = Ax, we know that any 
such solution must have the form x(t) = f(t)v, where f(t) is a scalar function 
and v is a constant vector. This form guarantees that x(t) traces out a path that 
is a straight line through 0 in the direction of v. In order for x(t) to satisfy the 
system, we observe that since x’(t) = f’(t)v, the equation 


f'(t)v = A(f(t)v) (3.2.4) 


must hold. Moreover, since f(t) is a scalar, the linearity of matrix multiplication 
allows us to rewrite (3.2.4) as 


f’(t)v = f(t)Av (3.2.5) 


Equation (3.2.5) is strongly reminiscent of the equation we use to define 
eigenvalues and eigenvectors: Ax = λx. In fact, if f’(t) = λf(t), then (3.2.5) 
implies that 


λf(t)v = f(t)Av 


Further, if f(t) ≠ 0, then Av = λv, and λ and v must be an 
eigenvalue-eigenvector pair of A. 

It is therefore natural for us to want f to satisfy the single differential 
equation f’(t) = λf(t). From our work in chapter 2, we know that f(t) = Ce^{λt} 
is the general solution to this equation. Substituting this form for f in (3.2.5), 
with C = 1 for simplicity, we now observe that 


λe^{λt}v = e^{λt}Av (3.2.6) 


and since e^{λt} is never zero, we can simplify (3.2.6) to 


Av = λv (3.2.7) 


which is satisfied precisely when v is an eigenvector of A with corresponding 
eigenvalue λ. 

Our most recent work has demonstrated that if x(t) is a function of the 
form x(t) = e^{λt}v that is a solution to x’ = Ax, then (λ, v) is an eigenpair of 
the coefficient matrix A. In fact, the converse also holds (as will be shown in 
the exercises), so that the following result is true for any n x n system of linear 
first-order differential equations. 

Theorem 3.2.1 Let A be an n x n matrix. The vector function x(t) = e^{λt}v is a 
solution to the homogeneous linear system of first-order differential equations 
given by x’ = Ax if and only if v is an eigenvector of A with corresponding 
eigenvalue λ. 
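
Theorem 3.2.1 can be checked numerically. The sketch below, which is ours rather 
than part of the text, approximates x’(t) by a central difference and compares it 
with Ax(t). The helper names (mat_vec, straight_line, residual), the step size h, 
and the sample time t = 0.3 are assumptions made for illustration; the matrix is 
the one that appears in example 3.2.2. 

```python
import math


def mat_vec(A, x):
    """Multiply a matrix (given as a list of rows) by a vector (a list)."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def straight_line(lam, v):
    """Return the vector function t -> e^(lam * t) * v."""
    return lambda t: [math.exp(lam * t) * vi for vi in v]


def residual(A, x, t, h=1e-6):
    """Largest entry of |x'(t) - A x(t)|, with x'(t) from a central difference."""
    deriv = [(p - m) / (2 * h) for p, m in zip(x(t + h), x(t - h))]
    return max(abs(d - r) for d, r in zip(deriv, mat_vec(A, x(t))))


A = [[-2.0, -2.0], [-4.0, 0.0]]

# (-4, [1 1]^T) is an eigenpair of A, so this function should solve x' = Ax.
res_eigen = residual(A, straight_line(-4.0, [1.0, 1.0]), t=0.3)

# [1 0]^T is not an eigenvector of A, so this function should fail to solve it.
res_other = residual(A, straight_line(-4.0, [1.0, 0.0]), t=0.3)

print(res_eigen, res_other)
```

The first residual is zero up to rounding and discretization error, while the 
second is of order one, in agreement with the "if and only if" in the theorem. 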


We close this section with one more example to demonstrate theorem 3.2.1 
and one of its important consequences. 


Example 3.2.2 Consider the system of differential equations given by 

x_1’ = -2x_1 - 2x_2 
x_2’ = -4x_1 

Write the system in the form x’ = Ax and show that A has two real eigenvalues 
with corresponding linearly independent eigenvectors. Verify by substitution 
that for each eigenvalue-eigenvector pair, x(t) = e^{λt}v is a solution of the system. 
In addition, show that any linear combination of such solutions is also a solution 
to the system. 


Solution. First, we observe that the system can be expressed in the form 
x’ = Ax by using the matrix 

A = \begin{bmatrix} -2 & -2 \\ -4 & 0 \end{bmatrix} 


We briefly review the process of determining the eigenvalues and eigenvectors 
of a matrix A; in most future occurrences, we will use Maple to determine this 
information using the commands introduced in section 1.10.2. 


Since the eigenvalues are the roots of the characteristic equation, we solve 
det(A - λI) = 0. Doing so, 

0 = det(A - λI) 
  = det \begin{bmatrix} -2-λ & -2 \\ -4 & -λ \end{bmatrix} 
  = (-2 - λ)(-λ) - 8 
  = λ^2 + 2λ - 8 = (λ + 4)(λ - 2) 

so the eigenvalues of A are λ = -4 and λ = 2. 

To find the eigenvector v that corresponds to λ = -4, we solve the equation 
(A - (-4)I)v = 0. Row-reducing the appropriate augmented matrix yields 

\left[\begin{array}{rr|r} 2 & -2 & 0 \\ -4 & 4 & 0 \end{array}\right] \to \left[\begin{array}{rr|r} 1 & -1 & 0 \\ 0 & 0 & 0 \end{array}\right] 

which shows that a corresponding eigenvector is any nonzero scalar multiple of the 
vector v_1 = [1 1]^T. Similar computations show that for λ = 2, a corresponding 
eigenvector is v_2 = [1 -2]^T. 
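
The text relies on Maple for this computation. For readers who prefer a different 
tool, the same hand calculation can be sketched in Python: solve the 
characteristic quadratic, then read an eigenvector off a nonzero row of A - λI. 
The function eigenpairs_2x2 is a hypothetical helper of ours, and it deliberately 
handles only the case of two distinct real eigenvalues. 

```python
import math


def eigenpairs_2x2(A):
    """Return [(λ, v), ...] for a 2 x 2 matrix with two distinct real eigenvalues.

    Returns [] otherwise; repeated and complex eigenvalues are not treated here.
    """
    (a, b), (c, d) = A
    trace, det = a + d, a * d - b * c
    disc = trace ** 2 - 4 * det  # discriminant of λ^2 - (trace)λ + det = 0
    if disc <= 0:
        return []
    pairs = []
    for lam in ((trace - math.sqrt(disc)) / 2, (trace + math.sqrt(disc)) / 2):
        # Any nonzero row (p, q) of A - λI is orthogonal to the eigenvector (q, -p).
        p, q = (a - lam, b) if (a - lam, b) != (0, 0) else (c, d - lam)
        v = [q, -p]
        lead = v[0] if v[0] != 0 else v[1]
        pairs.append((lam, [vi / lead for vi in v]))  # scale the leading entry to 1
    return pairs


A = [[-2, -2], [-4, 0]]
pairs = eigenpairs_2x2(A)
for lam, v in pairs:
    print(lam, v)
```

For this matrix the sketch reproduces the eigenpairs (-4, [1 1]^T) and 
(2, [1 -2]^T) found by hand. 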

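We now verify directly what theorem 3.2.1 guarantees: that x_1(t) = 
e^{-4t}[1 1]^T and x_2(t) = e^{2t}[1 -2]^T are solutions to the given system 
of equations. Observe first that 

x_1’(t) = -4e^{-4t} \begin{bmatrix} 1 \\ 1 \end{bmatrix} (3.2.8) 

and that 

Ax_1(t) = \begin{bmatrix} -2 & -2 \\ -4 & 0 \end{bmatrix} e^{-4t} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = e^{-4t} \begin{bmatrix} -4 \\ -4 \end{bmatrix} = -4e^{-4t} \begin{bmatrix} 1 \\ 1 \end{bmatrix} (3.2.9) 

Equations (3.2.8) and (3.2.9) confirm that indeed x_1’(t) = Ax_1(t) and 
demonstrate the role that eigenvalues and eigenvectors play in the solution. 
Similarly, for the function x_2(t), 

x_2’(t) = 2e^{2t} \begin{bmatrix} 1 \\ -2 \end{bmatrix} 

and 

Ax_2(t) = \begin{bmatrix} -2 & -2 \\ -4 & 0 \end{bmatrix} e^{2t} \begin{bmatrix} 1 \\ -2 \end{bmatrix} = e^{2t} \begin{bmatrix} 2 \\ -4 \end{bmatrix} = 2e^{2t} \begin{bmatrix} 1 \\ -2 \end{bmatrix} (3.2.10) 

This shows that x_2’(t) = Ax_2(t). 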

Finally, we are asked to show that any linear combination of x_1(t) and 
x_2(t) is also a solution to the differential equation. While we could confirm this 
somewhat laboriously through direct computations, it is much easier to work 
more generally and consider known properties of differentiation and matrix 
multiplication. 

In particular, differentiation is a linear operator and we know that if we let 
y(t) = c_1x_1(t) + c_2x_2(t) it follows that 

y’(t) = (c_1x_1(t) + c_2x_2(t))’ = c_1x_1’(t) + c_2x_2’(t) (3.2.11) 

Similarly, matrix multiplication is a linear process, so 

Ay(t) = A(c_1x_1(t) + c_2x_2(t)) = c_1Ax_1(t) + c_2Ax_2(t) (3.2.12) 

Since we have already established that x_1’(t) = Ax_1(t) and x_2’(t) = Ax_2(t), it 
follows that 

c_1x_1’(t) + c_2x_2’(t) = c_1Ax_1(t) + c_2Ax_2(t) 

so by (3.2.11) and (3.2.12) we have shown that y’(t) = Ay(t) and thus indeed 
every linear combination of x_1(t) and x_2(t) is also a solution to x’ = Ax. 

Example 3.2.2 provides the foundation for much of our study of linear systems 
of differential equations. It shows that when we can find real eigenvalues 
and eigenvectors, these lead us directly to solutions of the system. In addition, 
any linear combination of such solutions is also a solution to the system; we 
state this formally in the next theorem. 


Theorem 3.2.2 If (λ_1, v_1), (λ_2, v_2), ..., (λ_k, v_k) are eigenpairs of an n x n 
matrix A and c_1, ..., c_k are any scalars, then 


x(t) = c_1e^{λ_1 t}v_1 + c_2e^{λ_2 t}v_2 + ... + c_ke^{λ_k t}v_k 


is a solution to x’ = Ax. 
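
The superposition statement in theorem 3.2.2 can be illustrated with a short 
Python sketch of our own. It evaluates a linear combination of eigen-solutions 
and its exact derivative, then compares the derivative with Ax(t). The function 
name combo, the coefficients 3.0 and -1.5, and the sample time t = 0.7 are 
arbitrary assumptions; the eigenpairs are those of example 3.2.2. 

```python
import math


def combo(coeffs, eigenpairs, t, derivative=False):
    """Evaluate x(t) = sum of c_i e^(λ_i t) v_i, or x'(t) when derivative=True."""
    out = [0.0] * len(eigenpairs[0][1])
    for c, (lam, v) in zip(coeffs, eigenpairs):
        # Differentiating e^(λ t) simply multiplies the term by λ.
        scale = c * math.exp(lam * t) * (lam if derivative else 1.0)
        out = [o + scale * vi for o, vi in zip(out, v)]
    return out


def mat_vec(A, x):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


A = [[-2.0, -2.0], [-4.0, 0.0]]
eigenpairs = [(-4.0, [1.0, 1.0]), (2.0, [1.0, -2.0])]
coeffs = [3.0, -1.5]  # arbitrary scalars c_1 and c_2

lhs = combo(coeffs, eigenpairs, t=0.7, derivative=True)  # x'(t)
rhs = mat_vec(A, combo(coeffs, eigenpairs, t=0.7))       # A x(t)
gap = max(abs(l - r) for l, r in zip(lhs, rhs))
print(gap)
```

The gap is zero up to floating-point rounding for any choice of coefficients, 
which is exactly what linearity predicts. 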


In upcoming sections, we will determine whether we have found all of 
the solutions to a given system, address some subtle issues that arise when we 
cannot find enough real eigenvalues and eigenvectors, and better understand the 
graphical and long-term behavior of solutions. The exercises in this section will 
help further illuminate the roles of eigenvalues and eigenvectors as well as some 
of the issues that arise when there is an insufficient number of real eigenvectors 
for a given system’s matrix. 


Exercises 3.2 
In exercises 1-7, compute by hand the eigenvalues and eigenvectors of the given 
matrix. 

8. Consider the system of differential equations given by 
x_1’ = -2x_1 + 3x_2 
x_2’ = x_1 - 4x_2 
(a) Determine a matrix A so that the system may be written in the form 
x’ = Ax. 
(b) Determine all constant solutions to x’ = Ax. 
(c) Compute the eigenvalues and eigenvectors of A. 
(d) Determine all straight-line solutions to x’ = Ax. 
(e) Find a more general solution to x’ = Ax by taking all possible linear 
combinations of the straight-line solutions from (d). 
(f) Solve the initial-value problem x’ = Ax, x(0) = [1 2]^T. Discuss the 
graphical behavior of this solution. 


9. Consider the system of differential equations given by 
x_1’ = -x_1 + 2x_2 
x_2’ = -7x_1 + 8x_2 
(a) Determine a matrix A so that the system may be written in the form 
x’ = Ax. 
(b) Determine all constant solutions to x’ = Ax. 
(c) Compute the eigenvalues and eigenvectors of A. 
(d) Determine all straight-line solutions to x’ = Ax. 
(e) Find a more general solution to x’ = Ax by taking all possible linear 
combinations of the straight-line solutions from (d). 
(f) Solve the initial-value problem x’ = Ax, x(0) = [-2 0]^T. Discuss the 
graphical behavior of this solution. 


10. Consider the system of differential equations given by 
x_1’ = 2x_1 + 3x_2 
x_2’ = -4x_2 
(a) Determine a matrix A so that the system may be written in the form 
x’ = Ax. 
(b) Determine all constant solutions to x’ = Ax. 
(c) Compute the eigenvalues and eigenvectors of A. 
(d) Determine all straight-line solutions to x’ = Ax. 
(e) Find a more general solution to x’ = Ax by taking all possible linear 
combinations of the straight-line solutions from (d). 
(f) Explain how you could find this same general solution without 
determining eigenvalues and eigenvectors. (Hint: focus on x_2(t) 
first.) 
(g) Solve the initial-value problem x’ = Ax, x(0) = [0 1]^T. Discuss the 
graphical behavior of this solution. 




11. Consider the system of differential equations given by 
x_1’ = -2x_1 + x_2 
x_2’ = -2x_2 
(a) Determine a matrix A so that the system may be written in the form 
x’ = Ax. 
(b) Determine all constant solutions to x’ = Ax. 
(c) Compute the eigenvalues and eigenvectors of A. 
(d) Determine all straight-line solutions to x’ = Ax. 
(e) Find a more general solution to x’ = Ax by taking all possible linear 
combinations of your straight-line solutions from (d). 
(f) Attempt to solve the initial-value problem x’ = Ax, x(0) = [1 1]^T. 
What does this tell you about the proposed general solution in (e)? 


12. Consider the system of differential equations given by 
x_1’ = 2x_1 + 9x_2 
x_2’ = -x_1 - 2x_2 
(a) Determine a matrix A so that the system may be written in the form 
x’ = Ax. 
(b) Determine all constant solutions to x’ = Ax. 
(c) Compute the eigenvalues and eigenvectors of A. 
(d) Are there any straight-line solutions to x’ = Ax? Why or why not? 


13. Consider the system of differential equations given by 
x_1’ = -3x_1 + x_2 
x_2’ = 3x_1 - x_2 
(a) Determine a matrix A so that the system may be written in the form 
x’ = Ax. 
(b) Determine all constant solutions to x’ = Ax. Compare and contrast 
your findings with preceding exercises. 
(c) Compute the eigenvalues and eigenvectors of A. 
(d) Determine all straight-line solutions to x’ = Ax. How many such 
solutions exist? 
(e) Find a more general solution to x’ = Ax by taking all possible linear 
combinations of your straight-line solutions from (d). 
(f) Solve the initial-value problem x’ = Ax, x(0) = [3 0]^T. Discuss the 
graphical behavior of this solution. 


14. Consider the system of differential equations given by 
x_1’ = 3x_1 + x_2 + x_3 
x_2’ = x_1 + 3x_2 + x_3 
x_3’ = x_1 + x_2 + 3x_3 
(a) Determine a matrix A so that the system may be written in the form 
x’ = Ax. 
(b) Determine all constant solutions to x’ = Ax. 
(c) Compute the eigenvalues and eigenvectors of A. 
(d) Determine all straight-line solutions to x’ = Ax. 
(e) Find a more general solution to x’ = Ax by taking all possible linear 
combinations of your straight-line solutions from (d). 
(f) Solve the initial-value problem x’ = Ax, x(0) = [1 1 1]^T. Discuss the 
graphical behavior of this solution. 


15. Consider the system of differential equations given by 
x_1’ = 8x_1 - x_2 - 11x_3 
x_2’ = 18x_1 - 3x_2 - 19x_3 
x_3’ = 2x_1 - x_2 - 5x_3 
(a) Determine a matrix A so that the system may be written in the form 
x’ = Ax. 
(b) Determine all constant solutions to x’ = Ax. 
(c) Compute the eigenvalues and eigenvectors of A. 
(d) Determine all straight-line solutions to x’ = Ax. 
(e) Find a more general solution to x’ = Ax by taking all possible linear 
combinations of your straight-line solutions from (d). 
(f) Solve the initial-value problem x’ = Ax, x(0) = [1 1 1]^T. Discuss the 
graphical behavior of this solution. 


Recall from section 3.1 that a second-order linear differential equation whose 
solution is y(t) may be converted to a system of first-order linear equations 
whose solution is x = [x_1 x_2]^T through the substitution x_1 = y, x_2 = y’. See, 
for example, the discussion following (3.1.10). In exercises 16-22, convert each 
given higher order differential equation to a system of first-order equations 
through an appropriate substitution. 


16. y” - 4y = 0 

17. y” + y’ - 12y = 0 

18. y” + y’ + y = 0 

19. y” - 2y’ - 8y = e^t 

20. y’” + 3y” + 3y’ + y = 0 

21. y” - 6y’ + 5y = 0 

22. y) + ay!” —5y"”+y/—9y =0 


In sections 1.1 and 3.1, we showed how two connected tanks containing a solute 
lead to a system of linear first-order differential equations. In exercises 23-26, 
set up, but do not solve, a system of differential equations or initial-value problem 
whose solution would give the amount of salt in each tank at time t. Write each 
system in matrix form. 


23. A system of two tanks is connected in such a way that each of the tanks has 
an independent inflow that delivers salt solution to it, each has an 
independent outflow (drain), and each tank is connected to the other with 
an outflow and an inflow. The relevant information about each tank is 
given in the table below. 

                                  | Tank A             | Tank B 
Tank volume                       | 100 liters         | 200 liters 
Rate of inflow to the tank        | 5 liters/min       | 9 liters/min 
Concentration of salt in inflow   | 7 g/liter          | 3 g/liter 
Rate of drain outflow             | 4 liters/min       | 10 liters/min 
Rate of outflow to other tank     | to B: 3 liters/min | to A: 2 liters/min 


24. Suppose that in exercise 23 all of the given information remains the same 
except for the fact that instead of saltwater flowing into each tank, pure 
water flows in; that is, the concentration of salt in the entering solution is 
0 g/liter for each tank. 


25. In a closed system of two tanks (i.e., one for which there are no input flows 
and no output flows), the following information is given. Tank A is filled 
with 100 liters of solution whose initial concentration is 0.25 g/liter. 
Tank B is filled with 50 liters of solution whose initial concentration is 
3 g/liter. The two tanks are connected with two pipes having flows in 
opposite directions; mixed solution from Tank A flows to Tank B at a rate 
of 4 liters/min. Similarly, mixed solution flows from Tank B to Tank A at a 
rate of 4 liters/min. 


26. In a closed system of three tanks (i.e., one for which there are no input 
flows and no output flows), the following information is given. 

                                  | Tank A             | Tank B             | Tank C 
Tank volume                       | 100 liters         | 150 liters         | 125 liters 
Rates of outflows to other tanks  | to B: 3 liters/min | to C: 1 liter/min  | to A: 4 liters/min 
Rates of outflows to other tanks  | to C: 4 liters/min | to A: 3 liters/min | to B: 1 liter/min 

Tank A is filled with 100 liters of solution whose initial concentration is 
8 g/liter. Tank B is filled with 150 liters of solution whose initial 
concentration is 3 g/liter. Tank C is initially filled with 125 liters of pure 
water. The three tanks are connected with pipes having flows in opposite 
directions; flow rates are given in the table above. 


27. Show that if (λ, v) is an eigenpair of the matrix A, then x(t) = e^{λt}v is a 
solution to the homogeneous system of linear differential equations given 
by x’ = Ax. 


3.3 Homogeneous linear first-order systems 


In preceding sections, we have encountered examples of systems of two 
(or three) linear differential equations in two (or three) unknown functions. 
More generally, a linear system of n differential equations in n unknown functions 
(or simply, a linear system) is a collection of differential equations for which we 
seek unknown functions x_1(t), ..., x_n(t) when given n equations with coefficient 
functions a_{ij}(t) and b_i(t) in the form 

dx_1/dt = a_{11}(t)x_1 + a_{12}(t)x_2 + ... + a_{1n}(t)x_n + b_1(t) 

dx_2/dt = a_{21}(t)x_1 + a_{22}(t)x_2 + ... + a_{2n}(t)x_n + b_2(t) 

   ... 

dx_n/dt = a_{n1}(t)x_1 + a_{n2}(t)x_2 + ... + a_{nn}(t)x_n + b_n(t) 


It will be convenient to write the above system in matrix form. If we let x denote 
the vector function whose entries are x(t) = [x_i(t)], A(t) the n x n matrix of 
functions whose entries are A = [a_{ij}(t)], and b(t) the vector of functions whose 
entries are b = [b_i(t)], then the above system can be rewritten simply as 


x’ (t) = A(t)x(t) + b(t) (3.3.1) 


In much of our work, we will suppress the independent variable t and write 
x’ = Ax +b. Moreover, it will most often be the case that, as in examples 3.2.1 
and 3.2.2, the matrix A has all constant entries. Indeed, from this point on, 
unless otherwise noted, we will assume the matrix A has constant entries. 

In the event that b = 0, we say that the linear system is homogeneous. If 
b is nonzero, the system is nonhomogeneous. We have already encountered 
in theorems 3.2.1 and 3.2.2 the important facts that for any homogeneous 
first-order linear system x’ = Ax, every solution of the form x(t) = e^{λt}v requires 
(λ, v) to be an eigenpair of A, and that any linear combination of such solutions 
is also a solution to the system. 

Just as with individual differential equations, to each system of equations 
we can associate an initial-value problem. Using the matrix notation (3.3.1), if 
we assume that we also have the initial condition x(t_0) = x_0, then we have the 
standard initial-value problem 

x’(t) = A(t)x(t), x(t_0) = x_0 (3.3.2) 

We next consider a theoretical result (whose proof we omit) that will 
frame our overall work with systems. The following theorem is analogous to 
the earlier result we encountered in theorem 2.2.1 regarding the existence of a 
unique solution to the initial-value problem associated with a single first-order 
differential equation. 


Theorem 3.3.1 In (3.3.2), let the entries of the matrix A(t) be continuous 
functions on a common interval I that contains the value t_0. Then there exists a 
unique solution x(t) to (3.3.2) on the interval I. 


In particular, we note that in examples where the matrix A has constant 
coefficients, the entries are continuous functions, so that the IVP x’ = Ax, 
x(0) = x_0 is guaranteed to have a unique solution. We now examine this result 
more closely through a particular example, revisiting a problem we considered 
in the preceding section. 


Example 3.3.1 Determine the unique solution to the IVP given by 


x’ = \begin{bmatrix} -2 & -2 \\ -4 & 0 \end{bmatrix} x, x(0) = \begin{bmatrix} -5 \\ 3 \end{bmatrix} (3.3.3) 


Solution. We note, by theorem 3.3.1, that a unique solution exists. Moreover, 
from our work in example 3.2.2, every function of the form 


x(t) = c_1e^{-4t} \begin{bmatrix} 1 \\ 1 \end{bmatrix} + c_2e^{2t} \begin{bmatrix} 1 \\ -2 \end{bmatrix} (3.3.4) 


is a solution to the system x’ = Ax. We now explore whether we can find 
constants c_1 and c_2 in order that the function x(t) will satisfy the given initial 
condition in (3.3.3). 

The initial condition in (3.3.3) and (3.3.4) together imply 


\begin{bmatrix} -5 \\ 3 \end{bmatrix} = x(0) = c_1 \begin{bmatrix} 1 \\ 1 \end{bmatrix} + c_2 \begin{bmatrix} 1 \\ -2 \end{bmatrix} (3.3.5) 


We note that since the vectors [1 1]^T and [1 -2]^T (which are eigenvectors 
of A) are linearly independent and span R^2, we are guaranteed a unique 
solution to (3.3.5). Row-reducing the system (3.3.5), we find 


\left[\begin{array}{rr|r} 1 & 1 & -5 \\ 1 & -2 & 3 \end{array}\right] \to \left[\begin{array}{rr|r} 1 & 0 & -7/3 \\ 0 & 1 & -8/3 \end{array}\right] 


Thus, we have shown that c_1 = -7/3 and c_2 = -8/3, or equivalently that 


x(t) = -(7/3)e^{-4t} \begin{bmatrix} 1 \\ 1 \end{bmatrix} - (8/3)e^{2t} \begin{bmatrix} 1 \\ -2 \end{bmatrix} 


is the unique solution to the given initial-value problem. 

One especially important observation from example 3.3.1 can be made regarding 
the point at which we solved for the constants c_1 and c_2: we were guaranteed 
not only that a solution existed, but also that it was unique, due to the fact that 
two linearly independent eigenvectors of the 2 x 2 matrix A were present in the 
general solution (3.3.4). Indeed, if we imagine wanting to solve any similar IVP 
with the freedom to choose any initial vector x(0), it will be necessary that x(0) 
can be written as a linear combination of the vectors v_1 and v_2, whenever the 
general solution has form 


x(t) = c_1e^{λ_1 t}v_1 + c_2e^{λ_2 t}v_2 


This situation is indicative of the general fact that for all 2 x 2 linear systems 
of DEs, we must have two parts to the general solution, in order to be able 
to uniquely determine the constants c_1 and c_2. Note further that for the 
solutions x_1(t) = e^{λ_1 t}v_1 and x_2(t) = e^{λ_2 t}v_2 we encountered above, x_1(0) = v_1 
and x_2(0) = v_2 are linearly independent and form a basis for R^2. This 
linear independence of the constant vectors v_1 and v_2 turns out to have an 
important analog in the linear independence of certain solutions to the system 
of differential equations. 

More generally, we can consider these same issues for an n x n homogeneous 
system. Because theorem 3.3.1 guarantees the existence of a unique solution to 
the corresponding IVP for every initial condition x(0) ∈ R^n, when we think 
about the structure of the general solution, it is natural to think this solution 
will have form 


x(t) = c_1x_1(t) + c_2x_2(t) + ... + c_nx_n(t) 

where {x_1(0), x_2(0), ..., x_n(0)} form a basis for R^n. 


These observations, together with our earlier work in theorem 3.2.2 that showed 
that every linear combination of solutions to the general homogeneous linear 
system of DEs (3.3.1) is also a solution to (3.3.1), help explain why the set of all 
solutions to x’ = Ax, where A is a matrix with constant coefficients, is a vector 
space of dimension n. We state this formally in the following result. 


Theorem 3.3.2 The set of all solution vectors to the homogeneous linear 
system x’ = Ax, where A is an n x n matrix with constant coefficients, forms a 
vector space of dimension n. 


Theorem 3.3.2 shows us that in order to solve an n x n system of 
homogeneous first-order DEs, we must find n linearly independent solutions to 
the system. Said differently, the general solution to x’ = Ax will have form 


x(t) = c_1x_1(t) + c_2x_2(t) + ... + c_nx_n(t) (3.3.6) 

where x_1(t), ..., x_n(t) are linearly independent functions. Thus, our search for 
the general solution to the system requires us to find these n linearly independent 
functions x_1(t), ..., x_n(t). While we need to discuss in more detail what it means 
for vector functions (rather than constant vectors) to be linearly independent, 
we can first note that we know by theorem 3.2.1 that when (λ_i, v_i) is an eigenpair 
of A, the function 


x_i(t) = e^{λ_i t}v_i 


is a solution to x’ = Ax. This fact, combined with theorem 3.3.2, implies the 
result depicted in theorem 3.3.3. 


Theorem 3.3.3 If A is an n x n matrix with n linearly independent eigenvectors 
v_1, v_2, ..., v_n, with corresponding eigenvalues λ_1, λ_2, ..., λ_n (where the 
eigenvalues are not necessarily distinct), then the general solution to x’ = Ax is 


x(t) = c_1e^{λ_1 t}v_1 + c_2e^{λ_2 t}v_2 + ... + c_ne^{λ_n t}v_n (3.3.7) 


The linear independence of v_1, ..., v_n guarantees that we can solve the IVP 
x’ = Ax, x(0) = x_0 for every possible choice of x_0 ∈ R^n, since we may write 

x_0 = c_1v_1 + c_2v_2 + ... + c_nv_n 

for a unique set of values c_1, ..., c_n. This shows that the general solution (3.3.7) 
indeed captures all possible solutions to the system. 

In our original study of the eigenvalue problem in section 1.10, we 
observed (and proved in one of the exercises) that eigenvectors corresponding 
to distinct (real¹) eigenvalues are linearly independent. This yields an important 
consequence of theorem 3.3.3: if A has n distinct real eigenvalues, then A has n 
linearly independent (real) eigenvectors. In particular, the following corollary 
is true. 


Corollary 3.3.4 If A is an n x n matrix with n distinct real eigenvalues 
λ_1, λ_2, ..., λ_n, then the corresponding eigenvectors v_1, v_2, ..., v_n are linearly 
independent and the general solution to x’ = Ax is 


x(t) = c_1e^{λ_1 t}v_1 + c_2e^{λ_2 t}v_2 + ... + c_ne^{λ_n t}v_n (3.3.8) 

We now consider a specific example in which we see corollary 3.3.4 at work. 


Example 3.3.2 Determine the general solution to the homogeneous first-order 
system of DEs x’ = Ax and determine the unique solution to the initial-value 
problem 


x’ = Ax = \begin{bmatrix} -4 & 1 & -1 \\ -1 & -2 & 5 \\ -3 & 3 & 0 \end{bmatrix} x, x(0) = \begin{bmatrix} 1 \\ -2 \\ 3 \end{bmatrix} 


¹ We are interested in real solutions to the system x’ = Ax; when eigenvalues and eigenvectors are 
complex, additional work is needed. See section 3.5. 

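Solution. We begin by computing the eigenvalues and eigenvectors of 
A. Using the Eigenvectors(A) command in Maple, we find that the 
eigenvalues of A are λ_1 = -6, λ_2 = -3, λ_3 = 3, with corresponding eigenvectors 

v_1 = \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}, v_2 = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, v_3 = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} 


Since the eigenvalues of A are distinct, we know immediately that the 
corresponding eigenvectors are linearly independent, and therefore by 
corollary 3.3.4 that the general solution to the given system is 


x(t) = c_1e^{-6t} \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} + c_2e^{-3t} \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + c_3e^{3t} \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} (3.3.9) 

To solve the IVP with 

x(0) = \begin{bmatrix} 1 \\ -2 \\ 3 \end{bmatrix} 


we set t = 0 in (3.3.9) and apply the given condition, which leads to the vector 
equation 


c_1 \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} + c_2 \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + c_3 \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ -2 \\ 3 \end{bmatrix} 


Writing this equation in augmented matrix form and row-reducing shows that 


\left[\begin{array}{rrr|r} 1 & 1 & 0 & 1 \\ -1 & 1 & 1 & -2 \\ 1 & 0 & 1 & 3 \end{array}\right] \to \left[\begin{array}{rrr|r} 1 & 0 & 0 & 2 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 1 \end{array}\right] 


and, therefore, the solution to the IVP is 


x(t) = 2e^{-6t} \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} - e^{-3t} \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + e^{3t} \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} 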
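
The computations in example 3.3.2 can be double-checked with a short Python 
sketch. It is an illustration of ours, under the assumption that A has rows 
(-4, 1, -1), (-1, -2, 5), (-3, 3, 0), which is the matrix consistent with the 
stated eigenpairs. The helper solve_linear is hypothetical and performs 
Gauss-Jordan elimination with exact fractions; it assumes the coefficient matrix 
is invertible. 

```python
import math
from fractions import Fraction


def solve_linear(M, rhs):
    """Solve M c = rhs by Gauss-Jordan elimination (M assumed invertible)."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(b)] for row, b in zip(M, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


A = [[-4, 1, -1], [-1, -2, 5], [-3, 3, 0]]
eigenpairs = [(-6, [1, -1, 1]), (-3, [1, 1, 0]), (3, [0, 1, 1])]

# Check A v = λ v for each stated eigenpair.
eigen_ok = all(
    [sum(a * x for a, x in zip(row, v)) for row in A] == [lam * x for x in v]
    for lam, v in eigenpairs
)

# Columns of V are the eigenvectors; solve V c = x(0) for the constants.
V = [list(row) for row in zip(*[v for _, v in eigenpairs])]
c = solve_linear(V, [1, -2, 3])


def x(t):
    """Evaluate the IVP solution c_1 e^(λ_1 t) v_1 + c_2 e^(λ_2 t) v_2 + c_3 e^(λ_3 t) v_3."""
    return [
        sum(float(ci) * math.exp(lam * t) * v[i] for ci, (lam, v) in zip(c, eigenpairs))
        for i in range(3)
    ]


print(eigen_ok, c, x(0.0))
```

The constants returned are c_1 = 2, c_2 = -1, c_3 = 1, and evaluating the solution 
at t = 0 recovers the initial vector. 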


From theorem 3.3.3, we know that if we have an n x n matrix A with n linearly 
independent real eigenvectors, then we can completely solve the system x’ = Ax. 
But what if A lacks n real linearly independent eigenvectors? While we will 
encounter this situation in more detail in section 3.5, here it is worthwhile to 
note that we will still be seeking n linearly independent solutions x_1(t), ..., x_n(t) 
to the general system. For these vector functions, the fundamental meaning of 
linear independence remains the same as it does for constant vectors: the set of 
vector functions {x_1(t), ..., x_n(t)} is linearly independent if and only if the only 
values of c_1, ..., c_n that make 


c_1x_1(t) + ... + c_nx_n(t) = 0 (3.3.10) 


true for all values of t are c_1 = ... = c_n = 0. Testing the linear independence of 
vector functions is more involved; to do so, we introduce a new concept and a 
corresponding theorem. 


Definition 3.3.1 Given vector functions x_1(t), ..., x_n(t) where each x_i(t) ∈ R^n 
for all t, the Wronskian of these functions is 


W[x_1, ..., x_n] = det[x_1 ... x_n] (3.3.11) 


That is, the Wronskian of a set of n vector functions, each of which lies in R^n, is 
the determinant of the n x n matrix whose columns are x_1, ..., x_n. 


The Wronskian enables us to easily test whether or not vector functions 
are linearly independent through the following theorem, which will be stated 
without proof. 


Theorem 3.3.5 Let x_1(t), ..., x_n(t) be vector functions continuous on an 
interval I, where x_i(t) ∈ R^n for all t ∈ I. If at some point t in I, 
W[x_1, ..., x_n](t) ≠ 0, then {x_1(t), ..., x_n(t)} is linearly independent on I. 


We observe that this result appears reasonable since it is analogous to 
two statements that appear in the Invertible Matrix theorem: for a set of n 
constant vectors in R”, we know that the set is linearly independent if and only 
if the determinant of the matrix whose columns are these vectors is nonzero. 
Theorem 3.3.5 is a generalization of this result to the situation where the vectors 
are not constant. 

An example will now demonstrate the use of the Wronskian in showing a 
set of vector functions is linearly independent. 


Example 3.3.3 Consider the vector functions x_1 = [e^{-t}  -e^{-t}  e^{-t}]^T, 
x_2 = [3e^{2t}  e^{2t}  -2e^{2t}]^T, and x_3 = [e^{4t}  e^{4t}  e^{4t}]^T. Are x_1, x_2, 
and x_3 linearly independent? 


Solution. We use the Wronskian of x_1, x_2, and x_3 to determine their linear 
independence. Observe that 

W[x_1, x_2, x_3] = det \begin{bmatrix} e^{-t} & 3e^{2t} & e^{4t} \\ -e^{-t} & e^{2t} & e^{4t} \\ e^{-t} & -2e^{2t} & e^{4t} \end{bmatrix} 
  = e^{-t}(e^{2t}e^{4t} + 2e^{2t}e^{4t}) - 3e^{2t}(-e^{-t}e^{4t} - e^{-t}e^{4t}) + e^{4t}(2e^{-t}e^{2t} - e^{-t}e^{2t}) 
  = e^{-t}(3e^{6t}) - 3e^{2t}(-2e^{3t}) + e^{4t}(e^{t}) 
  = 10e^{5t} ≠ 0 


Since W[x_1, x_2, x_3] ≠ 0 for at least one t-value (in fact, for all t), it follows by 
theorem 3.3.5 that the functions x_1, x_2, and x_3 are linearly independent. 
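
The Wronskian test lends itself to a quick numerical check. The following Python 
sketch is ours and assumes the three vector functions of example 3.3.3 are 
e^{-t}[1 -1 1]^T, e^{2t}[3 1 -2]^T, and e^{4t}[1 1 1]^T. The helper names det3 and 
wronskian are hypothetical. 

```python
import math


def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def wronskian(funcs, t):
    """Determinant of the matrix whose columns are the vector functions at time t."""
    cols = [f(t) for f in funcs]
    return det3([list(row) for row in zip(*cols)])


def x1(t):
    return [math.exp(-t), -math.exp(-t), math.exp(-t)]


def x2(t):
    return [3 * math.exp(2 * t), math.exp(2 * t), -2 * math.exp(2 * t)]


def x3(t):
    return [math.exp(4 * t)] * 3


w_at_zero = wronskian([x1, x2, x3], 0.0)
print(w_at_zero)
```

A single nonzero value is enough to apply theorem 3.3.5; sampling other values 
of t shows agreement with the closed form 10e^{5t}. 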


In conclusion, we now know that when we encounter a homogeneous system of 
n linear first-order differential equations in n unknown functions, the set of all 
solutions to the system forms an n-dimensional vector space. Hence, we seek n 
linearly independent solutions to the system x’ = Ax. Such a set x_1, ..., x_n of n 
linearly independent solution vectors to this system is called a fundamental set. 
Moreover, given a set of fundamental solutions x_1, ..., x_n to x’ = Ax, on some 
interval I, the general solution to the system is 


x(t) = c_1x_1 + ... + c_nx_n 


We have also seen that if an n x n matrix A has n linearly independent 
real eigenvectors, then these eigenvectors and their corresponding eigenvalues 
generate a fundamental set for the system x’ = Ax. In subsequent sections we 
will find that, even in the case when an insufficient number of real eigenvectors 
exists, the eigenvalue problem enables us to build a fundamental set. Moreover, 
we will investigate how fundamental solutions allow us to fully understand the 
graphical behavior of solutions and the stability of equilibrium solutions to the 
system. 


Exercises 3.3 


1. If x’ = Ax represents the system of differential equations given by a 4 x 4 
matrix A with constant entries, how many linearly independent solutions 
to the system do we need to find in order to determine the general 
solution? What if A is 7 x 7? 


2. Consider the second-order differential equation y” + y = 0. Using the 
substitutions y = x_1 and y’ = x_2, convert the given second-order 
differential equation to a system of first-order equations. What is the 
dimension of the solution space to the system? What does this tell you 
about the dimension of the solution space to the original second-order 
equation? 


3. Consider the third-order differential equation y’” + 3y” + 3y’ + y = 0. 
Using the substitutions y = x_1, y’ = x_2, and y” = x_3, convert the given 
differential equation to a system of first-order equations. What is the 
dimension of the solution space to the system? What does this tell you 
about the dimension of the solution space to the original third-order 
equation? 




In exercises 4-8, use the Wronskian to determine if the given set of vector 
functions is linearly independent. 


4. xi(t)=[e* —e*]", x(t) =[e* 2e%]" 

5. x_1(t) = [cos t  sin t]^T, x_2(t) = [sin t  cos t]^T 

6. x)(t)=[e~? —e~"]", xo(t) =[—3e7* 3e7*]" 

7. xi(t)=[e’ —e! e']',x,(t)=[e”! 2e7* —3e7*]™,x3(t) = 
[4e—4# e4t et 

8. x_1(t) = [cos t  -sin t  0]^T, x_2(t) = [sin t  cos t  0]^T, x_3(t) = [0  0  e^t]^T 


9. Explain why for a set of two vector functions, the Wronskian is unneeded 
to check for linear independence. (Hint: what is the simple test for a pair 
of constant vectors to be linearly independent?) 


10. Let x’ = Ax be given by the matrix 


wee 


(a) Compute the eigenvalues and eigenvectors of A. Explain why these 
enable you to find the general solution to x’ = Ax. 

(b) State the general solution to the system. 

(c) Solve the IVP with the initial condition x(0) = [3 2]^T. 


11. Let x’ = Ax be given by the matrix 


=f 


(a) Compute the eigenvalues and eigenvectors of A. Explain why you have 
found one linearly independent solution to the system, but still need to 
determine another. 

(b) Verify through direct substitution that x;(t) = te*‘[1 OJ’ + e*‘[0 1)" 
is a solution to the given system x’ = Ax. 

(c) Show that the solution you found in (a) above and the solution x(t) 
in (b) are linearly independent, and hence state the general solution to 
the system. 

(d) Solve the IVP with the initial condition x(0) = [3 2]^T. 


12. Let x’ = Ax be given by the matrix 


= 


(a) Compute the eigenvalues and eigenvectors of A. Explain why, despite 
the repeated eigenvalue, you have found two linearly independent 
solutions to the system. 

(b) State the general solution to the system. 

(c) Solve the IVP with the initial condition x(0) = [3 2]^T. 
(d) Explain how you could solve the original system given in this problem 
without using eigenvalues and eigenvectors. 


13. Let x’ = Ax be given by the matrix 


fy 


(a) Compute the eigenvalues and eigenvectors of A. Explain why the 
eigenvalues and eigenvectors do not produce any real linearly 
independent solutions to the system. 

(b) Verify through direct substitution that x_1(t) = [cos t  sin t]^T and 
x_2(t) = [-sin t  cos t]^T are solutions to the given system x’ = Ax. 

(c) Show that the solutions you verified in (b) are linearly independent, 
and hence state the general solution to the system. 

(d) Solve the IVP with the initial condition x(0) = [3 2]^T. 


14. Let x’ = Ax be given by the matrix 


A = \begin{bmatrix} 5 & 6 & 2 \\ 0 & -1 & -8 \\ 1 & 0 & -2 \end{bmatrix} 


(a) Compute the eigenvalues and eigenvectors of A. Explain why your 
work determines two linearly independent solutions to the system, 
but that one additional linearly independent solution remains to be 
found. 

(b) Verify through direct substitution that 
x_3(t) = te^{3t}[5 -2 1]^T + e^{3t}[1 1/2 0]^T is a solution to the given 
system x’ = Ax. 

(c) Show that the set of three solutions from (a) and (b) is linearly 
independent, and hence state the general solution to the system. 

(d) Solve the IVP with the initial condition x(0) = [3 2 1]^T. 


15. Consider the second-order differential equation y” + y = 0. Convert 
this equation to a system of first-order equations and solve the system. 
Use your work to state the general solution y to the original equation. 
(Hint: See exercise 13.) 


16. Convert the second-order differential equation y” + 3y’ + 2y = 0 to a 
system of first-order equations and solve the system. Use your work to 
state the general solution y to the original equation. 


17. Convert the third-order differential equation y’” - y’ = 0 to a system of 
first-order equations and solve the system. Use your work to state the 
general solution y to the original equation. 


3.4 Systems with all real linearly independent eigenvectors 


In this section, we closely examine the graphical and long-term behavior of 
solutions to 2 x 2 systems in the case where the coefficient matrix A has two real, 
linearly independent eigenvectors. We do so through a sequence of examples 
that demonstrate a variety of possibilities that naturally lead to discussion of the 
stability of equilibrium solutions. 

We first review the graphical behavior of vector functions, a subject 
normally encountered in multivariable calculus. For the system x’ = Ax in 
the case where A is 2 × 2, every solution x(t) is a vector function whose output
lies in $\mathbb{R}^2$. In particular, the graph of x(t) is the curve that is traced out by the
vectors x(t) at various times t. For example, if 


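\[ x(t) = e^{-t} \begin{bmatrix} 1 \\ 0 \end{bmatrix} + e^{t} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} e^{-t} \\ e^{t} \end{bmatrix} \tag{3.4.1} \]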


is a function we have found by solving a system of differential equations, then 
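evaluating x(t) at t = −1, 0, and 1 yields the vectors


\[ x(-1) = \begin{bmatrix} e \\ e^{-1} \end{bmatrix}, \quad x(0) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad \text{and} \quad x(1) = \begin{bmatrix} e^{-1} \\ e \end{bmatrix} \tag{3.4.2} \]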


Plotting these vectors helps indicate how x(t) traces out the parametric curve 
given by $(x_1(t), x_2(t)) = (e^{-t}, e^{t})$, shown at left in figure 3.4.

In addition, it is important to recall the meaning of x’(t), the derivative of 
a vector function. The direction of the vector x’(t) indicates the instantaneous 
direction of motion of a particle traveling along the curve traced out by x(t), 
while the magnitude of x’(t) determines the instantaneous speed of the particle 
at time t. For our purposes, the direction of motion is most important because 


Figure 3.4 At left, the solution curve x(t) given in (3.4.1). At right, the solution curve
x(t) given in (3.4.1), along with corresponding scaled derivative vectors at times t = −1,
t = 0, and t = 1.

this indicates a flow along the solution curve as time increases. Thus, rather
than plotting the vector x’(t) at various times, we plot scaled versions of it, each 
emanating from the tip of x(t). For example, since 


\[ x'(t) = \begin{bmatrix} -e^{-t} \\ e^{t} \end{bmatrix} \tag{3.4.3} \]


it follows that 


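\[ x'(-1) \approx \begin{bmatrix} -2.718 \\ 0.368 \end{bmatrix}, \quad x'(0) = \begin{bmatrix} -1 \\ 1 \end{bmatrix}, \quad \text{and} \quad x'(1) \approx \begin{bmatrix} -0.368 \\ 2.718 \end{bmatrix} \tag{3.4.4} \]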


Plotting scaled versions of each of these vectors emanating from x(—1), x(0), 
and x(1), respectively, we see the updated image at the right in figure 3.4. 
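
The values in (3.4.2) and (3.4.4) can be checked numerically. The following Python sketch is only an illustration (the text itself works in Maple); the function names `x` and `x_prime` are assumptions introduced here, not notation from the text.

```python
import math

def x(t):
    """Solution curve (3.4.1): x(t) = (e^{-t}, e^{t})."""
    return (math.exp(-t), math.exp(t))

def x_prime(t):
    """Derivative (3.4.3): x'(t) = (-e^{-t}, e^{t})."""
    return (-math.exp(-t), math.exp(t))

# Evaluate the position and the derivative (velocity) vector at the three times.
for t in (-1, 0, 1):
    px, py = x(t)
    vx, vy = x_prime(t)
    print(f"t = {t:2d}: x(t) = ({px:.3f}, {py:.3f}), x'(t) = ({vx:.3f}, {vy:.3f})")
```

At each time the second components of x(t) and x’(t) agree, while the first components differ only in sign, which is why the derivative vectors point left and up along the curve.
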
These plots of the derivative vectors and the flow of the solution curve 
remind us of our earlier work with slope fields for single differential equations. 
Indeed, since a solution curve such as x(t) will always be the result of solving 
some differential equation x’ = Ax, we realize that we have a formula for x’, just 


as we had a formula for y’ in examples like y’ = —2y. In the example discussed 
above, we can view x(t) as being the solution to the system x’ = Ax where A is 
the matrix 
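
\[ A = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} \tag{3.4.5} \]

so that x’(t) satisfies the equation

\[ \begin{bmatrix} x_1'(t) \\ x_2'(t) \end{bmatrix} = x'(t) = Ax(t) = \begin{bmatrix} -x_1(t) \\ x_2(t) \end{bmatrix} \tag{3.4.6} \]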


In particular, (3.4.6) indicates how, for any point $(x_1, x_2)$ in the plane, we can
easily compute x’ at that point, and hence know the direction of the flow of 
the solution curve that passes through that point. Using a computer to conduct 
such computations at points sampled throughout the plane (with each resulting 
vector scaled to be of equal length), we get a picture of the so-called direction 
field for the system, shown at left in figure 3.5, which is analogous to a direction 
field for a single differential equation. 
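
The computation behind a direction field can be sketched in a few lines of Python. This is an illustrative sketch rather than the method used to produce the figures (those come from Maple); the helper name `direction` and the sample grid are assumptions made here.

```python
import math

# Coefficient matrix from (3.4.5).
A = [[-1.0, 0.0],
     [0.0, 1.0]]

def direction(A, x1, x2):
    """Unit vector in the direction of x' = Ax at the point (x1, x2).

    Returns (0.0, 0.0) at an equilibrium point, where x' = 0.
    """
    d1 = A[0][0] * x1 + A[0][1] * x2
    d2 = A[1][0] * x1 + A[1][1] * x2
    length = math.hypot(d1, d2)
    if length == 0.0:
        return (0.0, 0.0)
    return (d1 / length, d2 / length)

# Sample a coarse grid of points; a plotting program would draw one
# equal-length arrow at each point.
for x2 in (2, 0, -2):
    row = [direction(A, x1, x2) for x1 in (-2, 0, 2)]
    print("  ".join(f"({u:+.2f}, {v:+.2f})" for u, v in row))
```

Scaling every vector to unit length discards the speed information and keeps only the direction of flow, which is the information a direction field is meant to display.
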

If we now superimpose our plot of the solution curve in figure 3.4 in the 
direction field, now shown on the right in figure 3.5, we see clearly the role that 
the derivative x’ and the direction field play in determining the graph of the 
solution x, as well as the typical behavior of a solution as time increases. 

The $x_1$-$x_2$ plane is usually called the phase plane; note that the independent
variable t is implicit in the flow, while the behavior of the curve relative to the
coordinate axes demonstrates the interrelationship between the components
$x_1(t)$ and $x_2(t)$ of the solution x(t). Sample solution curves, such as the one plotted
in figure 3.5, are typically called trajectories. Each distinct trajectory is a solution
to an initial-value problem; the one in figure 3.5 can be viewed as the solution
to x’ = Ax, $x(0) = [1 \;\; 1]^T$.


Figure 3.5 At left, the direction field for the system x’ = Ax given by (3.4.5). At right,
the solution to (3.4.5) that is given by (3.4.1). 


We will now explore the direction field, phase plane, and trajectories for 
several examples of 2 × 2 systems of linear differential equations for which the
coefficient matrix has two real linearly independent eigenvectors. An important
theme throughout will be the long-range behavior of solutions x(t) as $t \to \infty$.
In addition, we will study the equilibrium solutions of each system; a solution 
x(t) is an equilibrium or constant solution if and only if x(t) is constant for all 
values of t. 


Example 3.4.1 Consider the system of differential equations given by x’ = Ax
where $A = \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix}$. Compute the eigenvalues and eigenvectors of A and
state the general solution to the system. In addition, determine all equilibrium
solutions of the system. Finally, plot the direction field for the system, sketch
several trajectories, and discuss the long-term behavior of solutions relative to
the equilibrium solution(s).

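Solution. The Maple command >Eigenvectors(A) produces the output

\[ \begin{bmatrix} 5 \\ 1 \end{bmatrix}, \quad \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix} \]

so that A has eigenvalues $\lambda_1 = 5$ and $\lambda_2 = 1$, with corresponding eigenvectors
$v_1 = [1 \;\; 1]^T$ and $v_2 = [-1 \;\; 1]^T$. We therefore know that the general solution to
x’ = Ax is

\[ x(t) = c_1 e^{5t} \begin{bmatrix} 1 \\ 1 \end{bmatrix} + c_2 e^{t} \begin{bmatrix} -1 \\ 1 \end{bmatrix} \]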


To find the equilibrium solution(s), we seek all constant vectors x that satisfy 
x’ = Ax. In this situation, since x is constant with respect to t, we know that 
x’ = 0, so we must solve the system of linear equations given by Ax = 0
where

\[ A = \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix} \]

Since det(A) ≠ 0, it follows that A is an invertible matrix, so the only solution
to Ax = 0 is x = 0. Thus the system has the origin as its only equilibrium 
solution. 

At the end of this section, in subsection 3.4.1, we will show how to use 
Maple to plot direction fields for systems. In this and subsequent examples, 
we'll simply provide these plots for discussion. In figure 3.6, we see not only the
direction field generated by the system, but also the plots of several trajectories, 
which are natural to sketch (even by hand, once the direction field is provided) 
by following the map that the direction field provides. 

Note particularly the straight-line solutions that follow the eigenvectors
$v_1 = [1 \;\; 1]^T$ and $v_2 = [-1 \;\; 1]^T$. Moreover, since both eigenvalues are positive,
the respective scalar functions $e^{5t}$ and $e^{t}$ both increase without bound as $t \to \infty$.
This explains why the flow along each straight-line solution is away from the 
origin. Indeed, every solution besides the zero solution flows away from the 
equilibrium solution at the origin. 
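
The eigenpairs in example 3.4.1 can also be confirmed without Maple. The Python sketch below is a minimal illustration: it takes the matrix with rows (3, 2) and (2, 3), finds the eigenvalues from the characteristic polynomial, and checks that Av = λv for each stated eigenvector. The helpers `eigenvalues_2x2` and `matvec` are hypothetical names introduced here, and the sketch assumes the eigenvalues are real.

```python
import math

def eigenvalues_2x2(A):
    """Real eigenvalues of a 2x2 matrix, from the characteristic polynomial
    lambda^2 - trace*lambda + det = 0. Raises if the roots are not real."""
    (a, b), (c, d) = A
    trace, det = a + d, a * d - b * c
    disc = trace * trace - 4 * det
    if disc < 0:
        raise ValueError("eigenvalues are not real")
    root = math.sqrt(disc)
    return ((trace + root) / 2, (trace - root) / 2)

def matvec(A, v):
    """Product of a 2x2 matrix with a vector of length 2."""
    return (A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1])

A = [[3, 2], [2, 3]]
print(eigenvalues_2x2(A))   # (5.0, 1.0)
print(matvec(A, (1, 1)))    # (5, 5), which is 5 * (1, 1)
print(matvec(A, (-1, 1)))   # (-1, 1), which is 1 * (-1, 1)
```
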


In chapter 2, we considered single autonomous differential equations such 
as y’ = 2y — 4. When we found equilibrium solutions to such equations, we 
also classified their stability based on the behavior exhibited in the direction 
field. We do likewise with equilibrium solutions for systems. In example 3.4.1, 


Figure 3.6 The direction field for the system 
x’ = Ax of example 3.4.1 along with several 
trajectories. 

we found that x = 0 is the only equilibrium solution of the system, and that
every non-constant solution flows away from 0. This shows that 0 is an unstable 
equilibrium, and in this case we naturally call 0 a repelling node. 

We next explore the behavior of a system where both eigenvalues are 
negative. 


Example 3.4.2. Consider the system of differential equations given by x’ = Ax
where $A = \begin{bmatrix} -2 & 2 \\ 1 & -3 \end{bmatrix}$. Compute the eigenvalues and eigenvectors of A, and
state the general solution to the system. In addition, determine all equilibrium
solutions to the system. Finally, plot the direction field for the system, sketch
several trajectories, and discuss the long-term behavior of solutions relative to
the equilibrium solution(s).


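Solution. Using Maple, we find that A has eigenvalues $\lambda_1 = -1$ and $\lambda_2 = -4$,
with corresponding eigenvectors $v_1 = [2 \;\; 1]^T$ and $v_2 = [-1 \;\; 1]^T$. The general
solution to x’ = Ax is therefore


\[ x(t) = c_1 e^{-t} \begin{bmatrix} 2 \\ 1 \end{bmatrix} + c_2 e^{-4t} \begin{bmatrix} -1 \\ 1 \end{bmatrix} \]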


To find the equilibrium solution, we set x’ = 0. Solving the system of linear 
equations given by Ax = 0, we see that since A is an invertible matrix, the only 
solution to Ax = 0 is x = 0, so the system has the origin as its only equilibrium 
solution. 

Plotting the direction field and several trajectories, as shown in figure 3.7, 
we observe that all solutions flow towards the equilibrium solution at the origin. 
This makes sense due to the presence of the scalar functions $e^{-4t}$ and $e^{-t}$ in
the general solution, as each approaches 0 as $t \to \infty$, and thus it follows that
$x(t) \to 0$ as $t \to \infty$. Moreover, note the two straight-line solutions that show
flow along stretches of the two eigenvectors $v_1 = [2 \;\; 1]^T$ and $v_2 = [-1 \;\; 1]^T$.


Because every non-constant solution to the system in example 3.4.2 approaches 
the equilibrium solution at 0, we say that the origin is a stable equilibrium. 
Moreover, based on the patterns in the flow, we use the terminology that 0 is an 
attracting node. 

We study the third case for a 2 × 2 linear system of differential equations
with two real, nonzero eigenvalues in the next example: the eigenvalues have
opposing signs.


Example 3.4.3 Let $A = \begin{bmatrix} 3 & -2 \\ 2 & -2 \end{bmatrix}$ and consider the system of differential
equations given by x’ = Ax. Find the general solution of the system, determine all 
equilibrium solutions to the system, and plot the direction field for the system. 
Include sketches of several trajectories and discuss the long-term behavior of 
solutions relative to the equilibrium solution(s). 


Figure 3.7 The direction field for the system
x’ = Ax in example 3.4.2 along with several 
trajectories. 


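Solution. We find that A has eigenvalues $\lambda_1 = 2$ and $\lambda_2 = -1$, with
corresponding eigenvectors $v_1 = [2 \;\; 1]^T$ and $v_2 = [1 \;\; 2]^T$. It follows that the
general solution to x’ = Ax is


\[ x(t) = c_1 e^{2t} \begin{bmatrix} 2 \\ 1 \end{bmatrix} + c_2 e^{-t} \begin{bmatrix} 1 \\ 2 \end{bmatrix} \]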


Since A is an invertible matrix, the only solution to Ax = 0 is x = 0, so the origin 
is the only equilibrium solution of the system.

As figure 3.8 shows, the direction field and various trajectories exhibit a 
different type of behavior around the origin. In particular, solutions that do 
not lie on either eigenvector appear to initially flow toward the origin, and then 
turn away and tend toward the straight-line solution associated with the positive 
eigenvalue. More specifically, it appears that solutions that do not pass through 
a point on the line in the direction of the eigenvector $[1 \;\; 2]^T$ are eventually
attracted to stretches of the eigenvector $[2 \;\; 1]^T$. This is reasonable since in the
general solution, $e^{-t}$ will tend to 0 as $t \to \infty$, leaving the function $c_1 e^{2t}[2 \;\; 1]^T$
to dominate. 


Since some solutions that pass through points near the origin tend away from 
the origin as $t \to \infty$, the origin is an unstable equilibrium in example 3.4.3.
Moreover, as the trajectories remind us of the contour plot in multivariable 
calculus of a surface whose graph looks like a saddle, we say in this context as 
well that the origin is a saddle point. 

The preceding examples demonstrate the three possible cases for a 2 × 2
system with real, nonzero eigenvalues: both positive, both negative, or of opposite sign.
Our next example investigates the situation when one eigenvalue is zero. 


Figure 3.8 The direction field for the system
x’ = Ax of example 3.4.3 along with several 
trajectories. 


Example 3.4.4 For the matrix $A = \begin{bmatrix} -3 & 1 \\ 3 & -1 \end{bmatrix}$ and the corresponding system
of differential equations x’ = Ax, find the general solution of the system and
determine all equilibrium solutions. Furthermore, plot the direction field for the
system along with sketches of several trajectories; discuss the long-term behavior
of solutions relative to the equilibrium solution(s).


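Solution. We first do the standard computations to find that A has eigenvalues
$\lambda_1 = -4$ and $\lambda_2 = 0$, with corresponding eigenvectors $v_1 = [-1 \;\; 1]^T$ and
$v_2 = [1 \;\; 3]^T$. Thus, the general solution to x’ = Ax is


\[ x(t) = c_1 e^{-4t} \begin{bmatrix} -1 \\ 1 \end{bmatrix} + c_2 \begin{bmatrix} 1 \\ 3 \end{bmatrix} \]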


We immediately notice something different about x(t). In particular, because 
the second eigenvalue is 0, the scalar function $e^{0t}$ has no effect on the general
solution. Furthermore, with $e^{-4t}$ the only part of x(t) that changes with t, we
can see that for any nonzero constant $c_1$ and any $c_2$, the graph of x(t) is always
a straight line where the direction is given by the eigenvector corresponding to 
the nonzero eigenvalue. 

In addition, the presence of a zero eigenvalue has a significant impact on 
the system’s equilibrium solutions. The fact that the columns of A are scalar 
multiples of each other leads us to see immediately that A is not invertible; 
this can be equivalently deduced from the fact that A has a zero eigenvalue. 
The singularity of A further implies that the homogeneous equation Ax = 0 
has infinitely many solutions. In particular, row-reducing the appropriate 
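augmented matrix, we find that

\[ \left[\begin{array}{rr|r} -3 & 1 & 0 \\ 3 & -1 & 0 \end{array}\right] \to \left[\begin{array}{rr|r} 1 & -1/3 & 0 \\ 0 & 0 & 0 \end{array}\right] \]

This implies that any constant vector x of the form

\[ x = \begin{bmatrix} x_1 \\ 3x_1 \end{bmatrix} \]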


satisfies the equation x’ = Ax, and therefore is an equilibrium solution. Note 
especially that $x = x_1[1 \;\; 3]^T$ is an eigenvector associated with $\lambda = 0$, and thus
every eigenvector associated with the zero eigenvalue is an equilibrium solution 
to the system. 

The interesting behaviors that we have discussed algebraically are seen
in figure 3.9. Specifically, every non-constant solution is a straight-line
solution in the direction of the eigenvector $[-1 \;\; 1]^T$ that is drawn toward an
equilibrium point that lies on the eigenvector $[1 \;\; 3]^T$ corresponding to the zero
eigenvalue.

The flows in figure 3.9, as well as the long-term behavior of the function $e^{-4t}$ in
the general solution x(t), clearly demonstrate that every equilibrium solution 
to the system is stable. Moreover, we say that each such equilibrium point is an 
attracting node. 
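
The four cases seen in examples 3.4.1 through 3.4.4 can be summarized in a short Python sketch. It is an illustration under stated assumptions, not part of the text's development: the function name `classify_origin` and the returned labels are choices made here, and the sketch handles only matrices with two distinct real eigenvalues. It relies on the standard facts that the product of the eigenvalues equals det(A) and their sum equals the trace of A. The test matrices are the ones consistent with the eigenpairs stated in the four examples.

```python
def classify_origin(A):
    """Classify the equilibrium at the origin for x' = Ax, where A is 2x2
    with two distinct real eigenvalues (the cases treated in this section)."""
    (a, b), (c, d) = A
    trace, det = a + d, a * d - b * c
    disc = trace * trace - 4 * det
    if disc <= 0:
        raise ValueError("this sketch handles only two distinct real eigenvalues")
    if det < 0:
        # Eigenvalues of opposite sign.
        return "saddle point (unstable)"
    if det > 0:
        # Eigenvalues share a sign, which is the sign of the trace.
        if trace > 0:
            return "repelling node (unstable)"
        return "attracting node (stable)"
    # det == 0: one eigenvalue is 0, the other equals the trace.
    if trace < 0:
        return "line of stable equilibria (zero eigenvalue)"
    return "line of unstable equilibria (zero eigenvalue)"

print(classify_origin([[3, 2], [2, 3]]))     # example 3.4.1
print(classify_origin([[-2, 2], [1, -3]]))   # example 3.4.2
print(classify_origin([[3, -2], [2, -2]]))   # example 3.4.3
print(classify_origin([[-3, 1], [3, -1]]))   # example 3.4.4
```

Working from the trace and determinant avoids testing a floating-point eigenvalue for exact equality with zero.
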

There are two important observations to make in closing. One is that we 
still must address the situations where A lacks two real linearly independent 
eigenvectors; we will do so in the next section. In addition, examples 3.4.1–3.4.4


Figure 3.9 The direction field for the system 
x’ = Ax of example 3.4.4 along with several 
trajectories. 

indicate that plotting a direction field is perhaps best left to a computer; however,
in the case where A has two real, linearly independent eigenvectors, it is a
straightforward exercise to use the eigenvectors to plot these straight-line solutions
by hand and to use the signs of the corresponding eigenvalues to understand 
the flows along the straight line solutions. Then, it is not difficult to imagine the 
overall appearance of the direction field and sketch several probable trajectories 
by hand, thus fully understanding the graphical behavior of all solutions to the 
system. 


3.4.1 Plotting direction fields for systems 
using Maple 


We again use the DEtools package, and load it with the command


> with(DEtools): 


To plot the direction field associated with a given system of differential 
equations, we first define the system itself, similar to how we defined a single 
differential equation in order to plot its slope field. We do this through the 


following command for the system with coefficient matrix $A = \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix}$ from
example 3.4.1. 


> sys := diff(x(t),t)= 3*x(t)+2*y(t),
diff(y(t),t)= 2*x(t)+3*y(t);


The system of differential equations of interest is now stored in “sys”. While
we typically use $x_1(t)$ and $x_2(t)$ to represent the component functions in our
discussion of the theory and solution of systems, in working with Maple it is 
often simpler to use x(t) and y(t). The direction field may now be generated by 
the command 


> DEplot([sys], [x(t),y(t)], t=-1..1, x=-4..4, 
y=-4..4, arrows=large, color=gray);


This command produces the output shown at left in figure 3.10. 


From here, it is a straightforward exercise to sketch trajectories by hand. Of 
course, Maple has the capacity to include trajectories that pass through any 
initial conditions we choose. For example, if we are interested in the various 
initial conditions x(0) = (2,2), (0,4), (4,0), and (—1, 1), we can modify the 
earlier DEplot command to 


> DEplot([sys], [x(t),y(t)], t=-1.6..3.6, x=-4..4, 
y=-4..4, arrows=large, color=gray, [[x(0)=-2,y(0)=0], 
[x(0)=0,y(0)=-2], [x(0)=2,y(0)=0], [x(0)=0,y(0)=2], 


Figure 3.10 At left, the direction field for the system x’ = Ax of example 3.4.1. At right,
the same direction field with several trajectories. 


The results of this most recent DEplot command are shown at right
in figure 3.10. 

As always, the user can experiment some with the window in which the plot 
is displayed: the range of x- and y-values can affect how clearly the direction field 
is revealed, and the range of t-values determines how much of each trajectory is 
plotted. 


Exercises 3.4 


1. Consider the system of differential equations x’ = Ax given by 


ale | 


(a) Determine the general solution to the system x’ = Ax. 

(b) Classify the stability of all equilibrium solutions to the system. 

(c) Sketch all straight-line solutions to the system and hence plot several 
nonlinear trajectories in the phase plane. 


2. Consider the system of differential equations x’ = Ax given by 
3 1 
he F ‘| 


(a) Determine the general solution to the system x’ = Ax. 
(b) Classify the stability of all equilibrium solutions to the system. 
(c) Sketch all straight-line solutions to the system and hence plot several
nonlinear trajectories in the phase plane. 


3. Consider the system of differential equations x’ = Ax given by 


—3 2 
ee 
(a) Determine the general solution to the system x’ = Ax. 
(b) Classify the stability of all equilibrium solutions to the system. 


(c) Sketch all straight-line solutions to the system and hence plot several 
nonlinear trajectories in the phase plane. 


4. Consider the system of differential equations x’ = Ax given by 


—2 0 
a=[> 3 
(a) Determine the general solution to the system x’ = Ax. 
(b) Classify the stability of all equilibrium solutions to the system. 
(c) Sketch the straight-line solutions to the system that correspond to the 


two linearly independent eigenvectors. Why is every solution to this 
system also a straight-line solution? 


5. Consider the system of differential equations x’ = Ax given by 


sce 


(a) Determine the general solution to the system x’ = Ax. 

(b) Classify the stability of all equilibrium solutions to the system. 

(c) Why is every non-constant solution to this system also a straight-line 
solution? How are these straight-line solutions related to the 
eigenvectors of the system? 


In exercises 6-9, let x(t) be the stated general solution to some system x’ = Ax. 
State the straight-line solutions to the system, classify the stability of the origin, 
and sketch some sample trajectories. 


et —5¢|3 

6. x(t) =ce ; + oe 1 
—1 1 

7. x(t) = ce** | i An eget | 


8. x(t) = ce”! |_| +o H 
9. “OH=ae"" H + oe gE


10. For the system x’ = Ax whose general solution is given in exercise 6,
determine a possible matrix A for the system. (Hint: If A is a matrix with
all real linearly independent eigenvectors and those eigenvectors are the
columns of a matrix P, then A satisfies the equation AP = PD, where D is
the diagonal matrix whose entries are the eigenvalues of A in order
corresponding to the eigenvectors in the columns of P.)


11. For the system x’ = Ax whose general solution is given in exercise 7,
determine a possible matrix A for the system.


12. Consider the four systems of equations given by x’ = Ax where A is given
by the matrices I, II, III, and IV below. Match each system with one of the
four direction field plots (a), (b), (c), and (d) given below. Write one
sentence for each to explain the reasoning behind your choice.


5 3 2-4 7 2 3 
La=[} i waa} | ut. a=|> w.a=|5 =) 


[Figure: direction field plots (a), (b), (c), and (d).]


In exercises 13-17, solve the IVP x’ = Ax with the given matrix A and stated
initial condition. 


; =| x(0)=[1 2] 


13. A=| 
: 3} x(0) =[—3 1]? 
; Ee x()=f1 -2]"7 
A I x(0)=[-2 —2]" 
2: 2 


7.4=| oy 


i x(0) =[1 4]" 

In exercises 18-22, use the standard substitution to convert the given second- 
order differential equation to a system of two linear first-order equations. Solve 
the system to hence determine the solution y to the second-order equation. 


18. $y'' - y' - 6y = 0$

19. $y'' - 6y' + 5y = 0$

20. $y'' + 4y' = 0$

21. $y'' + 3y' + 2y = 0$

22. $y'' + y = 0$


3.5 When a matrix lacks two real linearly 
independent eigenvectors 


We have seen repeatedly, both in theory and in specific examples, that when a 
2 × 2 matrix A has two real linearly independent eigenvectors, we can determine
the general solution to x’ = Ax and its graphical behavior. In this section, 
we address two remaining cases: when A has a repeated eigenvalue and only 
one associated real linearly independent eigenvector, and when A has complex 
eigenvalues and eigenvectors. In each case, we work through preliminary examples
to discover general patterns and principles, expand these principles with
appropriate theorems, and explore and discuss graphical behavior along the way.


Example 3.5.1 Consider the system of differential equations given by x’ = Ax
where $A = \begin{bmatrix} -2 & 1 \\ 0 & -2 \end{bmatrix}$. Compute the eigenvalues and eigenvectors of A and
explain why this alone does not lead to the general solution of the system. By
noting that the system is partially coupled, solve the system and determine a
second real, linearly independent solution. Finally, state the general solution.


Solution. By inspection, since A is a triangular matrix, we see that $\lambda = -2$
is a repeated eigenvalue of A with multiplicity 2. From this, we deduce that
$v_1 = [1 \;\; 0]^T$ is a corresponding eigenvector, and therefore one solution to
x’ = Ax is $x_1 = c_1 e^{-2t}[1 \;\; 0]^T$. However, A lacks a second linearly independent
eigenvector associated with $\lambda = -2$; therefore, we need to find a second real
linearly independent solution to the system in order to determine the general 
solution to x’ = Ax. In this example, we are fortunate that the system is only 
partially coupled and that therefore we may solve the system directly by using 
techniques for single differential equations from chapter 2. 

In particular, noting that the second equation in the system is $x_2' = -2x_2$,
it follows immediately that the solution to this single differential equation is
$x_2(t) = ce^{-2t}$. Substituting this result into the equation $x_1' = -2x_1 + x_2$, it
remains for us to solve the single nonhomogeneous linear first-order differential
equation


\[ x_1' = -2x_1 + ce^{-2t} \]


Applying our understanding of such equations from section 2.3, via the
integrating factor $v(t) = e^{2t}$ we know that


\[ x_1(t) = \frac{1}{e^{2t}} \left( \int e^{2t} \cdot ce^{-2t} \, dt + k \right) = cte^{-2t} + ke^{-2t} \]


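To summarize, with $x_1(t)$ and $x_2(t)$ as the components of x(t), we have found
that a solution to the system is


\[ x(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} = \begin{bmatrix} cte^{-2t} + ke^{-2t} \\ ce^{-2t} \end{bmatrix} \tag{3.5.1} \]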


If we factor this expression to write x(t) as a linear combination of two vectors 
in order to more clearly identify the role of the constants in (3.5.1), we see 


\[ x(t) = k \begin{bmatrix} e^{-2t} \\ 0 \end{bmatrix} + c \begin{bmatrix} te^{-2t} \\ e^{-2t} \end{bmatrix} \tag{3.5.2} \]


In this form, two key observations can be made. First, each individual vector 
in (3.5.2) may be verified to be a solution to the given system. Moreover, these 
two vectors are linearly independent. Hence, (3.5.2) is the general solution to 
the given system. 
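
As a worked check of the first observation, take the second vector in (3.5.2) together with the coefficient matrix whose rows are (−2, 1) and (0, −2), which is the matrix implied by the two equations solved above. Differentiating the vector and multiplying it by A give

\[
\frac{d}{dt} \begin{bmatrix} te^{-2t} \\ e^{-2t} \end{bmatrix}
= \begin{bmatrix} (1 - 2t)e^{-2t} \\ -2e^{-2t} \end{bmatrix},
\qquad
\begin{bmatrix} -2 & 1 \\ 0 & -2 \end{bmatrix}
\begin{bmatrix} te^{-2t} \\ e^{-2t} \end{bmatrix}
= \begin{bmatrix} -2te^{-2t} + e^{-2t} \\ -2e^{-2t} \end{bmatrix}
\]

and the two right-hand sides agree. For the second observation, evaluating the two vectors in (3.5.2) at t = 0 yields $[1 \;\; 0]^T$ and $[0 \;\; 1]^T$, which are clearly not scalar multiples of one another.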


While it is good that we were able to solve the system in example 3.5.1, it is still 
unclear how we will proceed in similar circumstances when neither equation in 
the system may be solved by techniques for single first-order equations. That is, 
if the equation for $x_1'$ involves $x_2$ and the equation for $x_2'$ involves $x_1$, but
the system’s matrix has only one linearly independent eigenvector, we cannot 
employ the approach used in example 3.5.1. However, the general form of the 
solution (3.5.2) can help us guess an appropriate form of the needed second 
linearly independent solution in the more general case. 

Recall that whenever $(\lambda, v)$ is a real eigenpair of A, the function
$x(t) = e^{\lambda t}v$ is a solution to x’ = Ax, and moreover x(t) is a straight-line solution
to the system. In example 3.5.1, we found that for the given matrix, which had a
repeated eigenvalue and only one associated linearly independent eigenvector,
the scalar function $te^{\lambda t}$ arose in the solution. If we recall that our original work
with $e^{\lambda t}v$ arose from guessing that a function of the form f(t)v was a solution
to x’ = Ax, example 3.5.1 now suggests that in the case where we are missing
an eigenvector, we consider a vector function that somehow involves the scalar
function $te^{\lambda t}$ as a second linearly independent solution to x’ = Ax. A closer look
at (3.5.2) suggests the form of this second solution we seek.

In particular, recalling that the matrix A in example 3.5.1 had $v_1 = [1 \;\; 0]^T$
as the eigenvector corresponding to $\lambda = -2$, rewriting (3.5.2) reveals the role $v_1$
plays in the general solution. Specifically,

\[ x(t) = ke^{-2t} \begin{bmatrix} 1 \\ 0 \end{bmatrix} + c \left( te^{-2t} \begin{bmatrix} 1 \\ 0 \end{bmatrix} + e^{-2t} \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right) \tag{3.5.3} \]

and since $x_1(t) = e^{-2t}[1 \;\; 0]^T$ is the standard solution that arises through the
eigenpair, we see from (3.5.3) that the second linearly independent solution
has the form $te^{-2t}v + e^{-2t}u$, where u is not an eigenvector of A corresponding
to $\lambda = -2$. This suggests a form for the second solution when this case arises in
general.

We now consider this situation for an arbitrary matrix with the appropriate 
properties. Let A be a 2 × 2 matrix with a single real, repeated eigenvalue $\lambda$ with
only one linearly independent eigenvector v. Note specifically that we know
$Av = \lambda v$ and $x_1(t) = e^{\lambda t}v$ is a solution to x’ = Ax. Now consider a second
function

\[ x_2(t) = te^{\lambda t}v + e^{\lambda t}u \tag{3.5.4} \]

where u is an unknown constant vector and $(\lambda, v)$ remains an eigenpair of A.
We seek conditions on u that will make $x_2(t)$ a solution to x’ = Ax; as we
have previously encountered in several instances, direct substitution into the
differential equation reveals the constraints on u.

First, differentiating (3.5.4) gives

\[ x_2'(t) = (\lambda te^{\lambda t} + e^{\lambda t})v + \lambda e^{\lambda t}u \tag{3.5.5} \]

Next, observe that multiplying $x_2(t)$ by A yields

\[ Ax_2(t) = A(te^{\lambda t}v + e^{\lambda t}u) = te^{\lambda t}(Av) + e^{\lambda t}(Au) \tag{3.5.6} \]

In order for $x_2(t)$ to be a solution to x’ = Ax, it follows from (3.5.5) and (3.5.6)
that we require the equality

\[ (\lambda te^{\lambda t} + e^{\lambda t})v + \lambda e^{\lambda t}u = te^{\lambda t}(Av) + e^{\lambda t}(Au) \tag{3.5.7} \]

to hold. Using the fact that $Av = \lambda v$ and expanding, we find

\[ \lambda te^{\lambda t}v + e^{\lambda t}v + \lambda e^{\lambda t}u = \lambda te^{\lambda t}v + e^{\lambda t}(Au) \tag{3.5.8} \]

With $\lambda te^{\lambda t}v$ present on both sides of (3.5.8), we can simplify the equality to

\[ e^{\lambda t}v + \lambda e^{\lambda t}u = e^{\lambda t}(Au) \tag{3.5.9} \]

Since $e^{\lambda t}$ is never zero, we observe from (3.5.9) that u must satisfy the equation

\[ v + \lambda u = Au \tag{3.5.10} \]

In other words, $(A - \lambda I)u = v$, where (as we assumed earlier) v is an eigenvector
of A that corresponds to the eigenvalue $\lambda$. In particular, note that v satisfies
the equation $(A - \lambda I)v = 0$. We summarize our work above in the following
theorem.


Theorem 3.5.1 If A is a 2 × 2 matrix with repeated eigenvalue $\lambda$ and only one
corresponding linearly independent eigenvector v, then the general solution to
x’ = Ax is given by

\[ x(t) = c_1 e^{\lambda t}v + c_2 e^{\lambda t}(tv + u) \]

where u satisfies the equation $(A - \lambda I)u = v$.


The vector u is often called a generalized eigenvector of A corresponding to $\lambda$.
We now demonstrate the role of theorem 3.5.1 in the following example. 


Example 3.5.2 Let $A = \begin{bmatrix} 1 & 4 \\ -1 & 5 \end{bmatrix}$ and consider the system of differential
equations given by x’ = Ax. Find the general solution of the system, determine all
equilibrium solutions to the system, and plot the direction field for the system.
Include sketches of several trajectories and discuss the long-term behavior of
solutions relative to the equilibrium solution(s).


Solution. We find that A has a single repeated eigenvalue $\lambda = 3$ with just one
corresponding linearly independent eigenvector $v = [2 \;\; 1]^T$. Thus, one linearly
independent solution to x’ = Ax is $x_1(t) = e^{3t}v$. Applying theorem 3.5.1, we
determine a second linearly independent solution to the system. Specifically,
we first solve the vector equation (A − 3I)u = v. To do so, we row-reduce the
appropriate augmented matrix and find


\[ \left[\begin{array}{rr|r} -2 & 4 & 2 \\ -1 & 2 & 1 \end{array}\right] \to \left[\begin{array}{rr|r} 1 & -2 & -1 \\ 0 & 0 & 0 \end{array}\right] \]


It follows that the vector u must have components $u_1$ and $u_2$ that satisfy the
equation $u_1 = 2u_2 - 1$, where $u_2$ is a free variable. Since we only need one




Figure 3.11 The direction field for the system 
x’ = Ax of example 3.5.2 along with several 
trajectories. 


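such vector u, we choose $u_2 = 0$ and thus $u_1 = -1$. From theorem 3.5.1, it
now follows that a second linearly independent solution to x’ = Ax is given
by the function $x_2(t) = e^{3t}(tv + u)$. In particular, the general solution to
x’ = Ax is

\[ x(t) = c_1 e^{3t} \begin{bmatrix} 2 \\ 1 \end{bmatrix} + c_2 e^{3t} \left( t \begin{bmatrix} 2 \\ 1 \end{bmatrix} + \begin{bmatrix} -1 \\ 0 \end{bmatrix} \right) \]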


We note further that since A is an invertible matrix, the only solution to 
Ax = 0 is x = 0, so the origin is the only equilibrium solution of the 
system. 

As figure 3.11 shows, the direction field and several trajectories exhibit 
behavior consistent with the fact that the system has just one straight- 
line solution, the one that corresponds to the single linearly independent 
eigenvector of A. Note as well that since the system’s only eigenvalue is 
positive, every non-constant solution flows away from the origin as $t \to \infty$.


In example 3.5.2, the origin is obviously an unstable equilibrium solution.
Because there is only one linearly independent eigenvector for the system, we 
call the origin a degenerate node, and in this case where $\lambda = 3 > 0$ and all the
trajectories flow away from the origin, this degenerate node is also called a 
repelling node. 
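
The computations in example 3.5.2 can be checked numerically. The Python sketch below is an illustration only: it assumes the coefficient matrix with rows (1, 4) and (−1, 5), which is the matrix for which A − 3I matches the augmented matrix row-reduced in the example, and the helper names `matvec`, `x2`, and `x2_prime` are introduced here. It confirms that (A − 3I)u = v and that the second solution from theorem 3.5.1 satisfies x’ = Ax at a few sample times.

```python
import math

A = [[1, 4], [-1, 5]]   # assumed: A - 3I has rows (-2, 4) and (-1, 2)
lam = 3                 # repeated eigenvalue
v = (2, 1)              # eigenvector
u = (-1, 0)             # generalized eigenvector chosen in the example

def matvec(M, w):
    """Product of a 2x2 matrix with a vector of length 2."""
    return (M[0][0] * w[0] + M[0][1] * w[1],
            M[1][0] * w[0] + M[1][1] * w[1])

# Check the defining equation (A - lam*I) u = v.
Au = matvec(A, u)
print((Au[0] - lam * u[0], Au[1] - lam * u[1]) == v)

def x2(t):
    """Second solution from theorem 3.5.1: e^{lam t} (t v + u)."""
    s = math.exp(lam * t)
    return (s * (t * v[0] + u[0]), s * (t * v[1] + u[1]))

def x2_prime(t):
    """Derivative by the product rule: e^{lam t} ((lam t + 1) v + lam u)."""
    s = math.exp(lam * t)
    return (s * ((lam * t + 1) * v[0] + lam * u[0]),
            s * ((lam * t + 1) * v[1] + lam * u[1]))

# Compare x2'(t) with A x2(t) at several times.
for t in (-1.0, 0.0, 0.5):
    lhs, rhs = x2_prime(t), matvec(A, x2(t))
    ok = all(math.isclose(p, q, rel_tol=1e-12, abs_tol=1e-12)
             for p, q in zip(lhs, rhs))
    print(t, ok)
```
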

We now consider an example that reveals the other possible situation that 
can arise when a matrix A lacks two real linearly independent eigenvectors: when 
A has no real eigenvalues and no real eigenvectors. 


Example 3.5.3. Consider the system x’ = Ax given by the matrix


\[ A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \]


Compute the eigenvalues and eigenvectors of A and explain why this does not 
lead directly to the general solution of the system. In addition, plot the direction 
field for the system to confirm these observations from a graphical perspective. 
Using familiarity with solutions to single differential equations and the form of 
the equations for the given system, determine the general solution to the system. 


Solution. The eigenvalues of the matrix A are computed using the
characteristic equation


\[ \det(A - \lambda I) = \det \begin{bmatrix} -\lambda & -1 \\ 1 & -\lambda \end{bmatrix} = \lambda^2 + 1 = 0 \]


We see that $\lambda^2 = -1$, so that $\lambda = \pm i$, where i is the complex number² $i = \sqrt{-1}$.
To determine the eigenvector associated with $\lambda = i$, we solve (A − iI)v = 0.

Row-reducing the appropriate matrix with complex entries just as we would a 

matrix with real entries, we observe 


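\[ \left[\begin{array}{rr|r} -i & -1 & 0 \\ 1 & -i & 0 \end{array}\right] \to \left[\begin{array}{rr|r} 1 & -i & 0 \\ -i & -1 & 0 \end{array}\right] \to \left[\begin{array}{rr|r} 1 & -i & 0 \\ 0 & 0 & 0 \end{array}\right] \]


where the first step was achieved by swapping the two rows, while the last step
was achieved by computing the row replacement $iR_1 + R_2 \to R_2$. It follows that
any eigenvector v associated with $\lambda = i$ must have components $v_1$ and $v_2$ that
satisfy $v_1 = iv_2$. Choosing $v_2 = 1$, we see that an eigenvector v corresponding to
$\lambda = i$ is $v = [i \;\; 1]^T$. Similar computations with $\lambda = -i$ show that a corresponding
eigenvector is $v = [-i \;\; 1]^T$. While we might suggest at this point that

\[ x(t) = c_1 e^{it} \begin{bmatrix} i \\ 1 \end{bmatrix} + c_2 e^{-it} \begin{bmatrix} -i \\ 1 \end{bmatrix} \]

is a solution to x’ = Ax, such a solution involves the complex number i, and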
is not a real solution to the system. A plot of the direction field for the system 
reveals further why no real solutions arise directly from the eigenvectors. In 
particular, if we examine figure 3.12, the direction field and various trajectories 
exhibit behavior consistent with the fact that the system has no straight-line 
solutions due to the fact that it has no real eigenpairs: every trajectory appears 
to be circular. 
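
The eigenpair computation above can be checked with a few lines of Python. This is only a sketch using the standard library: the matrix entries are read off from the system's equations (x_1’ = −x_2, x_2’ = x_1), and the helper names `eigenvalues_2x2` and `mat_vec` are illustrative choices, not notation from the text.

```python
import cmath

# Matrix of example 3.5.3, read off from x1' = -x2 and x2' = x1.
A = [[0, -1],
     [1, 0]]

def eigenvalues_2x2(A):
    """Roots of lambda^2 - tr(A)*lambda + det(A) = 0 (possibly complex)."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2

def mat_vec(A, v):
    """Product of a 2x2 matrix with a length-2 vector."""
    return [A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1]]

lam_plus, lam_minus = eigenvalues_2x2(A)
v = [1j, 1]                                  # candidate eigenvector for lambda = i
Av = mat_vec(A, v)                           # left side of A v = lambda v
lam_v = [lam_plus * v[0], lam_plus * v[1]]   # right side of A v = lambda v
print(lam_plus, lam_minus)
print(Av, lam_v)
```

Python's built-in complex type handles the arithmetic with i, much as the row reduction above treats complex entries like real ones.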

In this example, we will suspend our work with eigenvalues and eigenvectors 
and see whether we can determine a solution to the system more directly. If we 
examine the two equations given in the system x’ = Ax, we observe that we 


2 A review of key concepts with complex numbers may be found in appendix B. 
Figure 3.12 The direction field for the system x’ = Ax of example 3.5.3.


are trying to solve the two equations x_1’ = −x_2 and x_2’ = x_1 simultaneously. In
particular, we seek two functions x_1(t) and x_2(t) such that the derivative of the
first is the opposite of the second and the derivative of the second is the first. This
is a familiar scenario encountered in calculus and we recognize that x_1(t) = cos t
and x_2(t) = sin t form a pair of such functions. Further consideration reveals
that the choices x_1(t) = −sin t and x_2(t) = cos t also satisfy the system.

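Our recent observations show that the vector functions

\[ \mathbf{x}_1(t) = \begin{bmatrix} \cos t \\ \sin t \end{bmatrix} \quad \text{and} \quad \mathbf{x}_2(t) = \begin{bmatrix} -\sin t \\ \cos t \end{bmatrix} \]

each form a real solution to x’ = Ax; moreover, it is clear that the vectors x_1(t) and
x_2(t) are not scalar multiples of one another, and thus these are two linearly
independent solutions to the system. Therefore, theorem 3.3.2 implies that the
general solution to the given system is

\[ \mathbf{x}(t) = c_1 \begin{bmatrix} \cos t \\ \sin t \end{bmatrix} + c_2 \begin{bmatrix} -\sin t \\ \cos t \end{bmatrix} \tag{3.5.11} \]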


The presence of the sine and cosine functions in the entries of x will also lead to 
the circular trajectories we expect from the direction field in figure 3.12. 
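
As a numerical sanity check, the sketch below takes the general solution to be c_1[cos t, sin t]^T + c_2[−sin t, cos t]^T and confirms both that it satisfies x_1’ = −x_2, x_2’ = x_1 and that each trajectory stays at a fixed distance from the origin. The constants c_1 = 2 and c_2 = −1.5 are arbitrary values chosen for illustration.

```python
import math

def x(t, c1, c2):
    """General solution: c1*[cos t, sin t] + c2*[-sin t, cos t]."""
    return (c1 * math.cos(t) - c2 * math.sin(t),
            c1 * math.sin(t) + c2 * math.cos(t))

def x_prime(t, c1, c2):
    """Term-by-term derivative of x(t)."""
    return (-c1 * math.sin(t) - c2 * math.cos(t),
            c1 * math.cos(t) - c2 * math.sin(t))

c1, c2 = 2.0, -1.5   # arbitrary constants chosen for illustration
for t in (0.0, 0.7, 2.0, 5.0):
    x1, x2 = x(t, c1, c2)
    d1, d2 = x_prime(t, c1, c2)
    # The system requires x1' = -x2 and x2' = x1.
    assert abs(d1 + x2) < 1e-12 and abs(d2 - x1) < 1e-12
    # The distance from the origin stays sqrt(c1^2 + c2^2): a circle.
    assert abs(math.hypot(x1, x2) - math.hypot(c1, c2)) < 1e-12

radius = math.hypot(c1, c2)
print(radius)
```

The constant radius sqrt(c_1² + c_2²) is the algebraic counterpart of the circular trajectories seen in the direction field.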


Example 3.5.3 shows several new phenomena. In every preceding example we 
have considered for 2 x 2 systems x’ = Ax, eigenpairs have directly provided at 
least one real solution to the system. But for the latest system we examined, 
the eigenpairs appeared to not produce any solutions to the system at all. 
Moreover, for the first time in our work with linear systems, the sine and cosine 
functions arose. An important question to consider at this point is whether the
complex eigenpair

\[ \lambda = i, \quad \mathbf{v} = \begin{bmatrix} i \\ 1 \end{bmatrix} \tag{3.5.12} \]

can be linked to the general solution that we found in (3.5.11). It turns out
that the key idea lies in understanding how the exponential function e^z behaves
when the input z is a complex number.

The great Swiss mathematician Leonhard Euler (1707-1783) is credited
with discovering Euler’s formula, which states that for any real number t,

\[ e^{it} = \cos t + i \sin t \tag{3.5.13} \]


In exercise 14 in this section, one way to derive Euler’s formula through Taylor 
series for the exponential and trigonometric functions is explored. For now, we 
will simply accept (3.5.13) and put it to use. 
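
Before relying on Euler's formula, it can be reassuring to test it numerically. The following sketch compares Python's complex exponential `cmath.exp` with cos t + i sin t at a few sample values of t; the sample points and the helper name `euler_gap` are arbitrary choices made here.

```python
import cmath
import math

def euler_gap(t):
    """Absolute difference between e^{it} and cos t + i sin t."""
    return abs(cmath.exp(1j * t) - complex(math.cos(t), math.sin(t)))

samples = [0.0, 1.0, math.pi / 2, math.pi, -3.7]
max_gap = max(euler_gap(t) for t in samples)
print(max_gap)   # expected to be at the level of floating-point rounding
```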

Let us consider the standard form of a potential solution to x’ = Ax,
x(t) = e^{λt}v, using the complex eigenpair identified in (3.5.12). Here, since
the solution we are considering is in fact complex, we will use the notation z(t).
Using Euler’s formula and complex arithmetic, observe that

\[ \mathbf{z}(t) = e^{it} \begin{bmatrix} i \\ 1 \end{bmatrix} = (\cos t + i \sin t) \begin{bmatrix} i \\ 1 \end{bmatrix} = \begin{bmatrix} i \cos t - \sin t \\ \cos t + i \sin t \end{bmatrix} \tag{3.5.14} \]

When working with complex numbers, it is often useful to identify the real and
imaginary parts of the numbers. That is, for a complex number z = a + ib where
a and b are real, we call a the real part of z, and b the imaginary part of z. The
same distinctions hold for vectors with complex entries. Considering (3.5.14),
if we separate this vector into its real and imaginary parts, we may write

\[ \mathbf{z}(t) = \begin{bmatrix} -\sin t \\ \cos t \end{bmatrix} + i \begin{bmatrix} \cos t \\ \sin t \end{bmatrix} \tag{3.5.15} \]

If we now compare the general solution to x’ = Ax that we found in (3.5.11)
to (3.5.15) above, we can make a critical observation. The two linearly
independent solutions to the system seen in (3.5.11) are in fact the real and
imaginary parts of the vector z(t) which arose from considering z(t) = e^{λt}v
where (λ, v) was a complex eigenpair of A. That this fact holds in general is our
next stated theorem.


Theorem 3.5.2. If A is a real 2 × 2 matrix with a complex eigenvalue λ = a + ib
and corresponding eigenvector v = p + iq, where a, b, p, and q are real, then
the real and imaginary parts of

\[ \mathbf{z}(t) = e^{(a+ib)t}(\mathbf{p} + i\mathbf{q}) \]

are real linearly independent solutions to x’ = Ax.
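
The recipe in theorem 3.5.2 is easy to mechanize. The sketch below forms z(t) = e^{λt}v with Python's complex arithmetic, splits it into real and imaginary parts, and checks x’ = Ax approximately with a central difference. It is an illustration under stated assumptions rather than a proof: the function names `real_solutions` and `residual` are hypothetical, the step size h is an arbitrary choice, and the test case is the eigenpair λ = i, v = [i 1]^T of example 3.5.3.

```python
import cmath

def real_solutions(lam, v):
    """Return functions t -> Re z(t) and t -> Im z(t), where z(t) = e^{lam t} v."""
    def z(t):
        scale = cmath.exp(lam * t)
        return [scale * vi for vi in v]
    re_part = lambda t: [zi.real for zi in z(t)]
    im_part = lambda t: [zi.imag for zi in z(t)]
    return re_part, im_part

def residual(A, x, t, h=1e-6):
    """Largest entry of |x'(t) - A x(t)|, with x' from a central difference."""
    dx = [(p - q) / (2 * h) for p, q in zip(x(t + h), x(t - h))]
    xt = x(t)
    Ax = [sum(A[i][j] * xt[j] for j in range(len(xt))) for i in range(len(xt))]
    return max(abs(p - q) for p, q in zip(dx, Ax))

A = [[0, -1], [1, 0]]                           # matrix of example 3.5.3
re_part, im_part = real_solutions(1j, [1j, 1])  # eigenpair lambda = i, v = [i, 1]
worst = max(residual(A, f, t) for f in (re_part, im_part) for t in (0.0, 1.3, 4.0))
print(worst)   # small, limited only by the finite-difference approximation
```

Because the derivative is approximated, the residual is small but not exactly zero; an exact check would differentiate the real and imaginary parts by hand.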


We proceed to apply this result in another example involving complex 
eigenvalues and eigenvectors. 


Example 3.5.4 Let

\[ A = \begin{bmatrix} -1 & -2 \\ 2 & -1 \end{bmatrix} \]

and consider the system of differential
equations given by x’ = Ax. Find the general solution of the system, determine all
equilibrium solutions to the system, and plot the direction field for the system.
Include sketches of several trajectories and discuss the long-term behavior of
solutions relative to the equilibrium solution(s).


Solution. For matrices with complex eigenvalues, Maple provides an efficient
and valuable approach: the program completes the necessary complex arithmetic
automatically and produces the results we need. Doing so, we find that A has
complex eigenvalues λ = −1 ± 2i with corresponding complex eigenvectors
v = [±i 1]^T. We choose one of these complex eigenpairs and consider the
complex function

\[ \mathbf{z}(t) = e^{(-1+2i)t} \begin{bmatrix} i \\ 1 \end{bmatrix} \]

Observe that e^{(−1+2i)t} = e^{−t}e^{2it}, so by Euler’s formula

\[ e^{(-1+2i)t} = e^{-t}(\cos 2t + i \sin 2t) \]

Substituting this fact into z(t), we observe that

\[ \mathbf{z}(t) = e^{-t}(\cos 2t + i \sin 2t) \begin{bmatrix} i \\ 1 \end{bmatrix} = e^{-t} \begin{bmatrix} -\sin 2t + i \cos 2t \\ \cos 2t + i \sin 2t \end{bmatrix} = e^{-t} \begin{bmatrix} -\sin 2t \\ \cos 2t \end{bmatrix} + i e^{-t} \begin{bmatrix} \cos 2t \\ \sin 2t \end{bmatrix} \]

By theorem 3.5.2, it now follows that the real and imaginary parts of z(t) form
two real linearly independent solutions to x’ = Ax, and therefore the general
solution to x’ = Ax is

\[ \mathbf{x}(t) = c_1 e^{-t} \begin{bmatrix} -\sin 2t \\ \cos 2t \end{bmatrix} + c_2 e^{-t} \begin{bmatrix} \cos 2t \\ \sin 2t \end{bmatrix} \tag{3.5.16} \]

Figure 3.13 The direction field for the system x’ = Ax of example 3.5.4 along with several trajectories.

Since A is an invertible matrix, the origin is the only equilibrium solution
of the system. Finally, as figure 3.13 shows, the direction field and plotted
trajectories exhibit behavior consistent with the fact that the system has no
real eigenvectors and therefore no straight-line solutions. Moreover, since the
real part of λ = −1 + 2i is negative, the role of e^{−t} in the general solution (3.5.16)
draws every solution to 0 and thus the origin is a stable equilibrium.
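
The claims of example 3.5.4 can be spot-checked numerically. The sketch below assumes A has rows (−1, −2) and (2, −1), which is the matrix consistent with the eigenpairs λ = −1 ± 2i, v = [±i 1]^T quoted in the solution, and assumes the general solution has the form c_1 e^{−t}[−sin 2t, cos 2t]^T + c_2 e^{−t}[cos 2t, sin 2t]^T. The choice c_1 = c_2 = 1 and the finite-difference step are arbitrary.

```python
import math

A = [[-1, -2],
     [2, -1]]   # assumed matrix for example 3.5.4 (eigenvalues -1 ± 2i)

def x(t, c1=1.0, c2=1.0):
    """General solution (3.5.16) for chosen constants c1, c2."""
    e = math.exp(-t)
    return (e * (-c1 * math.sin(2 * t) + c2 * math.cos(2 * t)),
            e * (c1 * math.cos(2 * t) + c2 * math.sin(2 * t)))

def x_prime(t, c1=1.0, c2=1.0, h=1e-6):
    """Central-difference approximation to x'(t)."""
    p, q = x(t + h, c1, c2), x(t - h, c1, c2)
    return ((p[0] - q[0]) / (2 * h), (p[1] - q[1]) / (2 * h))

for t in (0.0, 0.5, 2.0):
    x1, x2 = x(t)
    d1, d2 = x_prime(t)
    # Compare x'(t) with A x(t), row by row.
    assert abs(d1 - (A[0][0] * x1 + A[0][1] * x2)) < 1e-6
    assert abs(d2 - (A[1][0] * x1 + A[1][1] * x2)) < 1e-6

# |x(t)| = e^{-t} * sqrt(c1^2 + c2^2), so the distance to the origin decays.
norms = [math.hypot(*x(t)) for t in (0, 1, 2, 3)]
print(norms)
```

The shrinking norms reflect the factor e^{−t}, while the sine and cosine terms supply the rotation.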


In cases such as the one in example 3.5.4 where there are no straight-line
solutions and every nonconstant solution tends to 0 as t → ∞, we naturally
say that 0 is a spiral sink. Note that this case corresponds to the situation where
the real part of a complex eigenvalue is negative. If the real part a of λ = a + bi is
positive, then we will have e^{at} present in the general solution, and this will drive
every solution away from the origin. We therefore call 0 a spiral source and note
that this equilibrium solution is unstable. Finally, in the event that a = 0 in the
complex eigenvalue λ = a + bi, as it was in example 3.5.3, then all nonconstant
solutions will orbit the origin while being neither drawn toward nor repelled from
the equilibrium solution. See, for example, figure 3.12. Such an equilibrium is
called a center and is considered stable.

In our discussions in this section we have addressed the two possible cases 
for a 2 x 2 matrix A which lacks two linearly independent eigenvectors. Our 
work extends naturally to the case of more general n × n systems where the
n × n matrix A may or may not have n real linearly independent eigenvectors.
Of course, in the case where A has a full set of n real linearly independent 
eigenvectors, the eigenpairs allow the general solution to the system to be 
determined. In cases where some of the eigenvalues are complex, or repeated 
with missing eigenvectors, we can work with each individual eigenvalue to build 
real linearly independent solutions in ways similar to our preceding work. Some 
examples are explored in the exercises that follow. 
Table 3.1
The stability of the origin as determined by the eigenvalues of a
2 × 2 matrix A

0 < λ_1 < λ_2            0 is unstable and called a repelling node
λ_1 < 0 < λ_2            0 is unstable and called a saddle
λ_1 < λ_2 < 0            0 is stable and called an attracting node
λ = a ± bi and a > 0     0 is unstable and called a spiral source
λ = a ± bi and a = 0     0 is stable and called a center
λ = a ± bi and a < 0     0 is stable and called a spiral sink


We close this section with a summary in table 3.1 of the stability of the 
origin as an equilibrium point of x’ = Ax in the cases where both eigenvalues 
are nonzero. 
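
Table 3.1 translates directly into a small decision procedure. The sketch below is one possible encoding, assuming a real 2 × 2 matrix with nonzero eigenvalues; it reads the eigenvalues from the trace and determinant through the characteristic polynomial λ² − (tr A)λ + det A. The function name `classify_origin` is hypothetical, and a repeated real eigenvalue is simply reported as a node, in line with the degenerate-node discussion earlier in the section.

```python
def classify_origin(A):
    """Classify the origin for x' = Ax (2x2, nonzero eigenvalues), per table 3.1."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if det == 0:
        raise ValueError("zero eigenvalue: not covered by table 3.1")
    disc = tr * tr - 4 * det
    if disc < 0:                       # complex pair a ± bi with a = tr/2
        a = tr / 2
        if a > 0:
            return "spiral source (unstable)"
        if a < 0:
            return "spiral sink (stable)"
        return "center (stable)"
    lam1 = (tr - disc ** 0.5) / 2      # real eigenvalues with lam1 <= lam2
    lam2 = (tr + disc ** 0.5) / 2
    if lam1 < 0 < lam2:
        return "saddle (unstable)"
    if lam1 > 0:
        return "repelling node (unstable)"
    return "attracting node (stable)"

print(classify_origin([[0, -1], [1, 0]]))    # the matrix of example 3.5.3
print(classify_origin([[3, 0], [0, 5]]))     # two positive eigenvalues
```

With floating-point entries, the exact comparisons with zero would need a tolerance; integer or simple rational entries avoid that issue.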


Exercises 3.5. For each of exercises 1-7, the general solution x(t) to a 
homogeneous linear 2 x 2 system of differential equations x’ = Ax is given. 
For each problem, sketch any straight-line solutions, classify the stability of the 
equilibrium solution x = 0, and sketch a few trajectories that are not straight 
lines. Do not use a computer. 


—l 1 
1. x(t) =cje~# | + ee ;] 


2. x(t) = cje72! be | + qe 7# E = | 


sint cost 


3. x(t) = ce?" zl +oe' H 


7. x(t) = ce”" H + ce! |; 


For each of exercises 8-13, the characteristic polynomial p(λ) of a matrix A
is given. That is, the zeros of the given polynomial are the eigenvalues of
the matrix A. For each, classify the stability of the origin as an equilibrium
point of the system given by x’ = Ax.


8. p(λ) = λ² − 4

9. p(λ) = λ² + 4

10. p(λ) = λ² + 4λ + 1

11. p(λ) = λ² − 10λ + 9
12. p(λ) = λ² − 2λ + 5
13. p(λ) = λ² + 3λ + 2


14. Recall or look up the formulas for the Taylor series about a = 0 for each of
the functions e^x, sin x, and cos x. Assuming that the Taylor series for e^x is
valid for complex numbers x, compute e^{ib} and compare the result to the
expansions for cos b and i sin b to show that

\[ e^{ib} = \cos b + i \sin b \]

In addition, show that

\[ e^{a+ib} = e^{a}(\cos b + i \sin b) \]

In exercises 15-19, a matrix A is given. For each, consider the system of 
differential equations x’ = Ax and respond to (a) - (d). 


(a) Determine the general solution to the system x’ = Ax. 
(b) Classify the stability of all equilibrium solutions to the system. 


(c) How many straight-line solutions does this system of equations have? 
Why? 


(d) Use a computer algebra system to plot the direction field for this system 
and sketch several trajectories by hand. 


16. A ; | 
7.4=[" 4 
18. A=[7§ | 
9.a=|) fF 
In exercises 20-24, solve the IVP given by x’ = Ax and the stated initial condition.


_ {0 —2 = T 
20.4=| al x(0) =[1 3] 


_|2 —3 _ T 
n.a=|3 AF x(0) =[-3 1] 


oe ee = T 
2.A=| : | x(0)=[2 —2] 
4 
5 


i} x(0)=[-2 —3]" 


=|? =4 _ T 
244-7 ‘i x(0) =[0 5] 


25. Consider the system of differential equations x’ = Ax given by

\[ A = \begin{bmatrix} 3 & 1 & -1 \\ 1 & 3 & 1 \\ -1 & 1 & 3 \end{bmatrix} \]


(a) Determine the general solution to the system x’ = Ax. 
(b) Classify the stability of all equilibrium solutions to the system. 


(c) How many straight-line solutions does this system of equations have? 
Why? 


26. Repeat exercise 25 using the matrix 


Oo 37% =12 
A=|-1 -3/2 3/2 
=] 12 =12 


27. Explain why every 3 x 3 homogeneous linear system of differential 
equations of the form x’ = Ax must always have at least one straight-line 
solution. Must every 4 x 4 system have at least one straight-line solution? 
Explain. What can you say about any n x n homogeneous linear system? 


In exercises 28-32, use the standard substitution to convert the given second- 
order differential equation to a system of two linear first-order equations. Solve 
the system to hence determine the solution y to the second-order equation. 


28. y″ + y′ − 6y = 0
29. y″ + 2y′ + 5y = 0
30. y″ + 4y = 0

31. y″ + 3y′ − 28y = 0
32." Fyt1=0 
3.6 Nonhomogeneous systems: undetermined coefficients


So far in our studies of systems of linear differential equations, we have focused 
almost exclusively on the case where the system is homogeneous and can be 
represented in the form x’ = Ax. We now begin to investigate nonhomogeneous 
systems, which are systems of the form x’ = Ax + b where b ≠ 0.

In section 3.1, we encountered a system of two tanks where we were
interested in the amount of salt in each tank at time t. With the amount of
salt in the two tanks represented respectively by x_1(t) and x_2(t), we saw that
these component functions had to satisfy the system of differential equations
given by

\[ \mathbf{x}' = \begin{bmatrix} -1/20 & 1/80 \\ 1/40 & -1/40 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} 20 \\ 35 \end{bmatrix} \tag{3.6.1} \]

and that this system is naturally represented in the form

\[ \mathbf{x}' = A\mathbf{x} + \mathbf{b} \tag{3.6.2} \]


In our most recent work with the homogeneous equation x’ = Ax, we noted 
several times the analogy to solving the single first-order differential equation 
x’ = ax. In particular, we observed the key role that e^{λt} plays in the process of
solving homogeneous systems of equations, much like e^{at} does in the solution
of a single homogeneous linear first-order equation. 

We next naturally consider the linear first-order analogy of (3.6.2), 
a nonhomogeneous equation such as 


\[ y' = 2y + 5 \tag{3.6.3} \]

In section 2.3, we made the observation in theorem 2.3.3 that for any linear
first-order differential equation in the form

\[ y' + p(t)y = f(t) \]

if y_p is any solution to the nonhomogeneous equation and y_h is a solution to
the corresponding homogeneous equation, then y = y_h + y_p is a solution to the
nonhomogeneous equation.
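
For the specific equation y′ = 2y + 5, this decomposition can be carried out by hand. The worked lines below are a sketch that assumes a constant particular solution y_p, a natural guess because the forcing term 5 is constant; C denotes an arbitrary constant.

```latex
\begin{align*}
y_h' = 2y_h \quad &\Longrightarrow \quad y_h = Ce^{2t} \\
0 = 2y_p + 5 \quad &\Longrightarrow \quad y_p = -\tfrac{5}{2} \\
y = y_h + y_p = Ce^{2t} - \tfrac{5}{2} \quad &\Longrightarrow \quad
y' = 2Ce^{2t} = 2\left(Ce^{2t} - \tfrac{5}{2}\right) + 5 = 2y + 5
\end{align*}
```

The last line confirms by direct substitution that the sum of the homogeneous and particular solutions satisfies the nonhomogeneous equation.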

In our studies of linear algebra in chapter 1, we made a similar observation 
in section 1.5: if we have a solution x_p to the nonhomogeneous equation Ax = b,
and we add to x_p any solution x_h to the homogeneous equation Ax = 0, the result
(x = x_p + x_h) is also a solution to Ax = b. See (1.5.1) to revisit the details of this
discussion. Note that in this purely linear algebra context, x is a vector whose 
entries are constant. 

These two preceding observations for linear first-order differential equations
and systems of linear algebraic equations are now applied to the
nonhomogeneous system of linear first-order differential equations, x’ = Ax + b.
We note specifically that in this context, x(t) is a function of t. Let’s return to
the known situation of the homogeneous system x’ = Ax and denote its solution
by x_h(t). In addition, suppose we are able to determine a single solution x_p(t)
to the nonhomogeneous equation x’ = Ax + b. We claim that the function
x(t) = x_h(t) + x_p(t) is the general solution of the nonhomogeneous equation.
To see this, we substitute directly into x’ = Ax + b and verify that the equation
is satisfied. By properties of linearity, observe that

\[ \mathbf{x}'(t) = \mathbf{x}_h'(t) + \mathbf{x}_p'(t) \tag{3.6.4} \]

and furthermore

\[ A\mathbf{x} + \mathbf{b} = A(\mathbf{x}_h + \mathbf{x}_p) + \mathbf{b} = A\mathbf{x}_h + A\mathbf{x}_p + \mathbf{b} \tag{3.6.5} \]

By how we defined x_h(t) and x_p(t), we know that x_h’(t) = Ax_h(t) and
x_p’(t) = Ax_p(t) + b, and thus (3.6.5) implies

\[ A\mathbf{x} + \mathbf{b} = \mathbf{x}_h'(t) + \mathbf{x}_p'(t) \tag{3.6.6} \]

From (3.6.4) and (3.6.6), we see that x = x_h + x_p is indeed a solution to
x’ = Ax + b. In fact, we have found the general solution to the nonhomogeneous
system, as stated in the following theorem.


Theorem 3.6.1 Let A be an n × n matrix with constant coefficients. If x_h is
the general solution to the homogeneous system x’ = Ax and x_p is any solution
to the nonhomogeneous system x’ = Ax + b, then x = x_h + x_p is the general
solution to x’ = Ax + b.


Theorem 3.6.1 provides an approach that will guide us throughout our 
efforts to solve nonhomogeneous systems of differential equations. First, we 
solve the associated homogeneous system to find x_h, a process we are familiar
with. We usually call x_h the complementary solution to the equation x’ = Ax + b.
Next, we must find a so-called particular solution x_p to the nonhomogeneous
system x’ = Ax + b. Although a more sophisticated approach will be introduced
in the next section, for now we will investigate a few examples in which the
process of finding such a particular solution x_p is relatively straightforward.


Example 3.6.1 From the system of two tanks discussed in sections 1.1 and 3.1,
consider the nonhomogeneous system of linear differential equations given by

\[ \mathbf{x}' = \begin{bmatrix} -1/20 & 1/80 \\ 1/40 & -1/40 \end{bmatrix} \mathbf{x} + \begin{bmatrix} 20 \\ 35 \end{bmatrix} \tag{3.6.7} \]

By solving the associated homogeneous system and determining a particular
solution to the nonhomogeneous system, find the general solution to the given
system. In addition, plot an appropriate direction field and discuss the long-term
behavior of solutions and their meaning in the context of the salt in
each tank. Determine and sketch the solution to the IVP with initial condition
x(0) = [2000 1000]^T.
Solution. We begin by solving x’ = Ax, where

\[ A = \begin{bmatrix} -1/20 & 1/80 \\ 1/40 & -1/40 \end{bmatrix} \]

The eigenvalues of A are approximately λ_1 = −0.0158 and λ_2 = −0.0592,
with corresponding eigenvectors approximated by v_1 = [0.366 1.000]^T and
v_2 = [−1.366 1.000]^T. It follows that the general solution x_h is

\[ \mathbf{x}_h(t) = c_1 e^{-0.0158t} \begin{bmatrix} 0.366 \\ 1.000 \end{bmatrix} + c_2 e^{-0.0592t} \begin{bmatrix} -1.366 \\ 1.000 \end{bmatrix} \]

Next, we must determine a particular solution x_p to the nonhomogeneous
equation x’ = Ax + b. In this particular example, b is a constant vector.
Therefore, it is natural to guess that a constant vector x_p will satisfy the
nonhomogeneous equation. More than this, we should recall from earlier
discussions of the problem leading to the given system that the vector x
represents the amounts of salt in two connected tanks as streams of inflow
deliver salt, each at a constant rate. Our intuition suggests that over time the
two tanks should approach a stable equilibrium, and hence an equilibrium (and
therefore constant) solution should be present.

Therefore, we assume that x_p is a constant vector and observe that this
immediately implies that x_p’ = 0. Substituting into x’ = Ax + b, it follows that
x_p must satisfy the system of linear equations 0 = Ax_p + b or Ax_p = −b. With the
given entries of A and b, this leads us to row reduce the appropriate augmented
matrix and find that

\[ \left[\begin{array}{cc|c} -1/20 & 1/80 & -20 \\ 1/40 & -1/40 & -35 \end{array}\right] \to \left[\begin{array}{cc|c} 1 & 0 & 1000 \\ 0 & 1 & 2400 \end{array}\right] \]

This shows x_p = [1000 2400]^T is a particular solution to x’ = Ax + b, and, more
specifically, is an equilibrium solution of the system. Moreover, it now follows
that the general solution to the system is given by

\[ \mathbf{x}(t) = \mathbf{x}_h(t) + \mathbf{x}_p(t) = c_1 e^{-0.0158t} \begin{bmatrix} 0.366 \\ 1.000 \end{bmatrix} + c_2 e^{-0.0592t} \begin{bmatrix} -1.366 \\ 1.000 \end{bmatrix} + \begin{bmatrix} 1000 \\ 2400 \end{bmatrix} \tag{3.6.8} \]

If we add the initial condition that x(0) = [2000 1000]^T, we can solve for the
constants c_1 and c_2 and plot the appropriate corresponding trajectory, as shown
in figure 3.14. In both (3.6.8) and figure 3.14 we can see how the long-term
behavior of every solution tends to the equilibrium solution. Moreover, in the
direction field we can also recognize the straight-line solutions that correspond
to lines in the direction of each eigenvector but that now pass through the
equilibrium solution (1000, 2400).
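
The computations in example 3.6.1 can be reproduced with a short script. The sketch below uses only the standard library: it recomputes the eigenpairs of A from the characteristic polynomial, finds the constant particular solution from Ax_p = −b, and then solves for c_1 and c_2 from x(0) = [2000 1000]^T. The helper `solve_2x2` (Cramer's rule) and the scaling of each eigenvector so that its second entry is 1 are choices made here for illustration.

```python
import math

A = [[-1/20, 1/80],
     [1/40, -1/40]]
b = [20.0, 35.0]

def solve_2x2(M, r):
    """Solve M y = r by Cramer's rule (M must be invertible)."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(r[0] * M[1][1] - M[0][1] * r[1]) / det,
            (M[0][0] * r[1] - r[0] * M[1][0]) / det]

# Eigenvalues from the characteristic polynomial lambda^2 - tr*lambda + det.
tr = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
root = math.sqrt(tr * tr - 4 * det)
lam1, lam2 = (tr + root) / 2, (tr - root) / 2

# Eigenvectors scaled so the second entry is 1, from row 2 of (A - lam I)v = 0.
v1 = [(lam1 - A[1][1]) / A[1][0], 1.0]
v2 = [(lam2 - A[1][1]) / A[1][0], 1.0]

# Constant particular solution: A x_p = -b.
x_p = solve_2x2(A, [-b[0], -b[1]])

# Initial condition x(0) = [2000, 1000]: c1*v1 + c2*v2 = x(0) - x_p.
x0 = [2000.0, 1000.0]
V = [[v1[0], v2[0]], [v1[1], v2[1]]]
c1, c2 = solve_2x2(V, [x0[0] - x_p[0], x0[1] - x_p[1]])

def x(t):
    """Solution of the IVP at time t."""
    e1, e2 = math.exp(lam1 * t), math.exp(lam2 * t)
    return [c1 * e1 * v1[0] + c2 * e2 * v2[0] + x_p[0],
            c1 * e1 * v1[1] + c2 * e2 * v2[1] + x_p[1]]

print(lam1, lam2, v1, v2, x_p, c1, c2)
```

Evaluating `x(t)` for large t shows the solution approaching the equilibrium, since both exponentials decay.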


From example 3.6.1, we observe that in cases where we want to solve x’ = Ax + b
and b is itself a constant vector, x_p may be determined by assuming that x_p is a
constant vector and solving 0 = Ax_p + b. If x_p is not constant, then the situation
is more complicated, as we discover in the following example.
Figure 3.14 The direction field for the system x’ = Ax + b of example 3.6.1.


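Example 3.6.2 Find the general solution of the nonhomogeneous system
given by

\[ \mathbf{x}' = \begin{bmatrix} 2 & -1 \\ 3 & -2 \end{bmatrix} \mathbf{x} + \begin{bmatrix} \cos 2t \\ 0 \end{bmatrix} \tag{3.6.9} \]

Solution. Since the eigenvalues of

\[ A = \begin{bmatrix} 2 & -1 \\ 3 & -2 \end{bmatrix} \]

are λ_1 = −1 and λ_2 = 1 with
corresponding eigenvectors v_1 = [1 3]^T and v_2 = [1 1]^T, it follows that the
complementary solution to the related homogeneous system is

\[ \mathbf{x}_h = c_1 e^{-t} \begin{bmatrix} 1 \\ 3 \end{bmatrix} + c_2 e^{t} \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]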

To determine the particular solution x_p to the given nonhomogeneous
system, we need to find a vector function x_p(t) that satisfies
the system (3.6.9). Due to the presence of cos 2t in the vector b, it is natural
to guess that the components of x_p will somehow involve cos 2t. In addition,
since x_p’ plays a role in the system, we must account for the possibility that the
derivative of cos 2t may also arise; moreover, since Ax_p will also be computed,
linear combinations of the entries in x_p will be present.
Therefore, we make the reasonable guess that x_p has the form

\[ \mathbf{x}_p = \begin{bmatrix} a \cos 2t + b \sin 2t \\ c \cos 2t + d \sin 2t \end{bmatrix} \tag{3.6.10} \]

and attempt to determine values for the undetermined coefficients a, b, c, and
d that make x_p a solution to the system.


We accomplish this by direct substitution into (3.6.9). First, observe that

\[ \mathbf{x}_p' = \begin{bmatrix} -2a \sin 2t + 2b \cos 2t \\ -2c \sin 2t + 2d \cos 2t \end{bmatrix} \tag{3.6.11} \]

Now substituting (3.6.10) and (3.6.11) into (3.6.9), it follows that

\[ \begin{bmatrix} -2a \sin 2t + 2b \cos 2t \\ -2c \sin 2t + 2d \cos 2t \end{bmatrix} = \begin{bmatrix} 2 & -1 \\ 3 & -2 \end{bmatrix} \begin{bmatrix} a \cos 2t + b \sin 2t \\ c \cos 2t + d \sin 2t \end{bmatrix} + \begin{bmatrix} \cos 2t \\ 0 \end{bmatrix} \]

If we now expand the matrix product and factor out the terms involving sin 2t
and cos 2t on the right side, we obtain

\[ -2a \sin 2t + 2b \cos 2t = (2b - d) \sin 2t + (2a - c + 1) \cos 2t \tag{3.6.12} \]

\[ -2c \sin 2t + 2d \cos 2t = (3b - 2d) \sin 2t + (3a - 2c) \cos 2t \tag{3.6.13} \]


In (3.6.12), we can equate the coefficients of sin2t to find that —2a = 2b—d. 
Doing likewise for the coefficients of cos2t, 2b = 2a — c + 1. Similarly, (3.6.13) 
results in the two equations —2c = 3b — 2d and 2d = 3a — 2c. Reorganizing 
these four equations in four unknowns, we see that a, b,c, and d must satisfy 
the system 


—2a—2b+d=0 
—2a+2b+c=1 
—3b—2c+2d=0 
—3a+2c+2d=0 


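Row-reducing,

\[ \left[\begin{array}{cccc|c} -2 & -2 & 0 & 1 & 0 \\ -2 & 2 & 1 & 0 & 1 \\ 0 & -3 & -2 & 2 & 0 \\ -3 & 0 & 2 & 2 & 0 \end{array}\right] \to \left[\begin{array}{cccc|c} 1 & 0 & 0 & 0 & -2/5 \\ 0 & 1 & 0 & 0 & 2/5 \\ 0 & 0 & 1 & 0 & -3/5 \\ 0 & 0 & 0 & 1 & 0 \end{array}\right] \]

which shows a = −2/5, b = 2/5, c = −3/5, and d = 0, so a particular solution
to the nonhomogeneous system is

\[ \mathbf{x}_p = \begin{bmatrix} -\frac{2}{5} \cos 2t + \frac{2}{5} \sin 2t \\ -\frac{3}{5} \cos 2t \end{bmatrix} \]

Finally, it follows that the general solution to the system is

\[ \mathbf{x} = \mathbf{x}_h + \mathbf{x}_p = c_1 e^{-t} \begin{bmatrix} 1 \\ 3 \end{bmatrix} + c_2 e^{t} \begin{bmatrix} 1 \\ 1 \end{bmatrix} + \begin{bmatrix} -\frac{2}{5} \cos 2t + \frac{2}{5} \sin 2t \\ -\frac{3}{5} \cos 2t \end{bmatrix} \]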

One lesson to take from example 3.6.2 is that while the process for trying
to solve a nonhomogeneous system of differential equations is straightforward,
the actual computation of a particular solution x_p can be quite cumbersome.
Indeed, even in the case where the vector b is quite simple, as it is in the most
recent example, tedious calculations can arise. Moreover, it is less clear how one
might proceed in the situation where the vector b is particularly complicated.
Specifically, making an appropriate guess for x_p may be difficult. We usually
call the process of finding x_p through a guess involving unknown constants the
method of undetermined coefficients.

To gain a better sense of the guesses that are involved in using undetermined 
coefficients, we turn to the following example. 


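Example 3.6.3 For nonhomogeneous linear systems of the form x’ = Ax + b
where A is a matrix with constant entries, state the natural guess to use for x_p
when the vector b is

\[ \text{(a) } \mathbf{b} = \begin{bmatrix} e^{-t} \\ 2e^{-t} \end{bmatrix} \quad \text{(b) } \mathbf{b} = \begin{bmatrix} 1 \\ t \end{bmatrix} \quad \text{(c) } \mathbf{b} = \begin{bmatrix} t^2 \\ 0 \end{bmatrix} \quad \text{(d) } \mathbf{b} = \begin{bmatrix} e^{-3t} \\ -2 \end{bmatrix} \]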


Solution.
(a) With b = [e^{−t} 2e^{−t}]^T, it is natural to expect that any particular solution
must involve e^{−t} in its components. Specifically, we make the guess that

\[ \mathbf{x}_p = \begin{bmatrix} Ae^{-t} \\ Be^{-t} \end{bmatrix} \]

and substitute directly into x’ = Ax + b in order to attempt to find values
of A and B for which x_p satisfies the given system (see footnote 3).

(b) Given b = [1 t]^T, we must account for the fact that x_p and its derivative
can involve constant and linear functions of t. In particular, we suppose that

\[ \mathbf{x}_p = \begin{bmatrix} At + B \\ Ct + D \end{bmatrix} \]

and substitute appropriately in an effort to determine A, B, C, and D.

(c) For b = [t² 0]^T, with one quadratic term present in b, it is necessary to
include quadratic terms in each entry of x_p. But since the derivative of x_p
will be taken, linear terms must be included as well. Finally, once linear
terms are included, for the same reason we must permit the possibility that
constant terms can be present in x_p. Therefore, we guess the form

\[ \mathbf{x}_p = \begin{bmatrix} At^2 + Bt + C \\ Dt^2 + Et + F \end{bmatrix} \]

(d) With b = [e^{−3t} −2]^T having both an exponential and constant term
present, we account for both of these scalar functions and their derivatives
by assuming that

\[ \mathbf{x}_p = \begin{bmatrix} Ae^{-3t} + B \\ Ce^{-3t} + D \end{bmatrix} \]


3 It is possible that the guess can fail to work, in which case a modified form for x_p is required. One
setting where this may occur is when λ = −1 is an eigenvalue of A, whereby a vector involving e^{−t}
already appears in the complementary solution x_h. See exercise 8 for further investigation of this
issue.
The method of undetermined coefficients is not foolproof: it is certainly possible
to guess incorrectly (as noted in the footnote related to part (a) of example 3.6.3).
If our guess is incorrect, an inconsistent linear system of algebraic equations will
arise, which tells us we need to modify our guess. Besides the possibility of
guessing incorrectly, it can also be the case that the computations involved in
determining x_p are very cumbersome. In the next section, we consider a different
approach, one that parallels our solution of single linear first-order differential 
equations of the form y’ + p(t)y = f(t), that provides, at least in theory, an 
algorithmic approach to solving any nonhomogeneous system x’ = Ax + b 
where the matrix A has real, constant entries. 

Finally, we note that the presence of nonconstant entries in the vector b in
a nonhomogeneous system x’ = Ax + b makes it impossible to plot a direction
field for the system. In particular, when we sketch direction fields, we rely on
the fact that regardless of time, t, the direction vector x’ to the solution curve x
is dependent only on the location (x_1, x_2), and not on t. When b is nonconstant
and a function of t, this is no longer the case and we therefore are left with only
algebraic approaches to the problem. If b is constant, then we can generate the
direction field for the system, such as the one shown in figure 3.14.


Exercises 3.6 In each of exercises 1-4, show by direct substitution that the 
given particular solution xp is indeed a solution to the stated nonhomogeneous 
system of equations. Hence determine the general solution to the stated system. 


; —1 3 5 —4 
ef deL} 
; | 2 si +[—1/3 
2.x =|3 ‘ls | "op =e | Al 
3.x/= ; | x+ el Xp= sint a + cost es 


2 (=3> Jl r+] — [1]. [3/14 
ax'=[ 1 teal yf Pa lajia 


5. Consider the system of differential equations 


ae 


(a) Explain why it is reasonable to assume that xp is a constant vector, and 
use this assumption to determine a particular solution to the given 
nonhomogeneous system. 

(b) Determine the complementary solution x; to the associated 
homogeneous system, x’ = Ax. 

(c) State the general solution to the system. 

(d) Is there an equilibrium solution to this system? If so, is it stable? 
Explain. 
6. Consider the system of differential equations


ae ae ett 
a i+ 0 


(a) Explain why it is reasonable to assume that xp is a vector of the form 


aett 
Xp =| pest 
Then use this assumption to determine a particular solution to the 
given nonhomogeneous system. 
(b) Determine the complementary solution x_h to the associated
homogeneous system, x’ = Ax.
(c) State the general solution to the system. 


7. Consider the system of differential equations 


11 et 4] 
a 
— i ts+ pet 
(a) Explain why it is reasonable to assume that xp is a vector of the form 
a ae** +h 
P| ce? +d 
Use this assumption to determine a particular solution to the given 
nonhomogeneous system. 
(b) Determine the complementary solution x_h to the associated
homogeneous system, x’ = Ax.
(c) State the general solution to the system. 


8. Consider the system of differential equations 


(a) Explain why it is reasonable to assume that x_p is a vector of the form

\[ \mathbf{x}_p = \begin{bmatrix} ae^{-t} \\ be^{-t} \end{bmatrix} \]

(b) Show that the form of x_p above does not result in a particular solution
to the system.

(c) By assuming that x_p is a vector of the form

\[ \mathbf{x}_p = \begin{bmatrix} ae^{-t} + bte^{-t} \\ ce^{-t} + dte^{-t} \end{bmatrix} \]

determine a particular solution to the given nonhomogeneous system.

(d) Determine the complementary solution x_h to the associated
homogeneous system, x’ = Ax.

(e) State the general solution to the system.
For the nonhomogeneous linear systems of differential equations given in
exercises 9-17, (a) determine a particular solution x_p by making an appropriate
assumption about the form of x_p, (b) determine the complementary solution
x_h to x’ = Ax, and (c) hence state the general solution to the system.


10. =| 
u.x'=[ 
12. | 
13. ak “a 
14. | 
15. x'=| 
16. 23) 
nx = ; = hae 


18. For the system of differential equations given in exercise 10, solve the IVP
with initial condition x(0) = [1 −2]^T.


19. For the system of differential equations given in exercise 11, solve the IVP
with initial condition x(0) = [−3 −2]^T.


20. For the system of differential equations given in exercise 14, solve the IVP
with initial condition x(0) = [0 4]^T.


21. For the system of differential equations given in exercise 15, solve the IVP
with initial condition x(0) = [1 −2]^T.


22. Without actually computing xp, choose and justify the form you would 
guess for a particular solution to 


— —4 5 —2t-: 1 
x'=| 5 ilete sint|_t| 




23. Without actually computing xp, choose and justify the form you would 
guess for a particular solution to 


,_ 1-4 5 sin 3t 
a | 5 i}s+ | 
24. Suppose that x_1(t) and x_2(t) are solutions of

x’ = Ax + f_1(t) and x’ = Ax + f_2(t)

respectively. Show that x(t) = x_1(t) + x_2(t) is a solution of

x’ = Ax + f_1(t) + f_2(t)


3.7 Nonhomogeneous systems: variation of parameters 


In section 3.6, we discovered that solving the nonhomogeneous linear system
x’ = Ax + b requires us to find one particular solution x_p to the nonhomogeneous
system. We then combine this particular solution with the complementary
solution x_h, the general solution to the corresponding homogeneous system
x’ = Ax. While we were able to successfully solve a range of problems, the
method of undetermined coefficients is somewhat dissatisfying: essentially we 
made an educated guess as to the form that xp should take, and then substituted 
to see if our guess was appropriate and resulted in a particular solution. As was 
shown in exercise 8 in section 3.6, there are instances when the obvious guess 
fails to work and additional investigation of a possible solution xp is needed. 
Moreover, with undetermined coefficients we only considered functions b(t) 
that had entries that were polynomial, sinusoidal, or exponential in nature. We 
desire a more systematic approach to finding xp; developing such a method is 
the purpose of this section. 

In section 2.3, we learned that for any linear first-order differential equation
of the form y’ + p(t)y = f(t), the solution y is given by

\[ y = e^{-P(t)} \int e^{P(t)} f(t) \, dt \tag{3.7.1} \]

where P(t) = ∫ p(t) dt. We now seek to establish a similar result for the case
of systems of the form x’ = Ax + b, where A is an n × n matrix with constant
entries and b is a vector function of t. Let us first consider the form of the
general solution x_h to the corresponding homogeneous system. Recall that
x_h = c_1x_1 + ⋯ + c_nx_n, where {x_1, …, x_n} is a set of n linearly independent solutions
to x’ = Ax.

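Being more explicit about the vectors present, say with entries x_{ij}(t), we can
rewrite x_h = c_1x_1 + ⋯ + c_nx_n as

\[ \mathbf{x}_h = c_1 \begin{bmatrix} x_{11} \\ x_{21} \\ \vdots \\ x_{n1} \end{bmatrix} + c_2 \begin{bmatrix} x_{12} \\ x_{22} \\ \vdots \\ x_{n2} \end{bmatrix} + \cdots + c_n \begin{bmatrix} x_{1n} \\ x_{2n} \\ \vdots \\ x_{nn} \end{bmatrix} = \begin{bmatrix} c_1 x_{11} + c_2 x_{12} + \cdots + c_n x_{1n} \\ c_1 x_{21} + c_2 x_{22} + \cdots + c_n x_{2n} \\ \vdots \\ c_1 x_{n1} + c_2 x_{n2} + \cdots + c_n x_{nn} \end{bmatrix} \]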
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Now observe that the right side of the above equation—the overall vector 
formulation of x—can be expressed as a matrix product. In particular, we write 


x= 0C (3.7.2) 


where C is the vector whose entries are the arbitrary constants c),...,c;, that 
arise in the formulation of the general solution x, and ®(t) is the matrix whose 
columns are the n linearly independent solutions to x’ = Ax. We call ®(t) the 
fundamental solution matrix of the system. 

At this point, it is essential to make two observations about $\Phi(t)$. The first
is that $\Phi(t)$ is nonsingular for every relevant value of $t$. This holds because the
columns of $\Phi(t)$ are linearly independent since, by definition, they are linearly
independent solutions of $\mathbf{x}' = A\mathbf{x}$. Second, we note that $\Phi'(t) = A\Phi(t)$. Since
the derivative of $\Phi(t)$ is taken component-wise, this equation is simply the
matrix way to say that each column of $\Phi(t)$ satisfies the homogeneous system
of equations $\mathbf{x}' = A\mathbf{x}$.
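
To make the two observations concrete, here is a short check on an illustrative $2 \times 2$ system. The matrix $A$ and the two homogeneous solutions used as columns are assumptions of this sketch; they are chosen to agree with the complementary solution $c_1 e^{-t}[1\ 3]^T + c_2 e^{t}[1\ 1]^T$ that appears in the examples of this section.

\[ A = \begin{bmatrix} 2 & -1 \\ 3 & -2 \end{bmatrix}, \qquad \Phi(t) = \begin{bmatrix} e^{-t} & e^{t} \\ 3e^{-t} & e^{t} \end{bmatrix}, \qquad \det \Phi(t) = e^{-t}e^{t} - 3e^{-t}e^{t} = -2 \neq 0, \]

\[ \Phi'(t) = \begin{bmatrix} -e^{-t} & e^{t} \\ -3e^{-t} & e^{t} \end{bmatrix}, \qquad A\Phi(t) = \begin{bmatrix} 2e^{-t} - 3e^{-t} & 2e^{t} - e^{t} \\ 3e^{-t} - 6e^{-t} & 3e^{t} - 2e^{t} \end{bmatrix} = \begin{bmatrix} -e^{-t} & e^{t} \\ -3e^{-t} & e^{t} \end{bmatrix}. \]

In this instance the determinant is nonzero for every $t$, and $\Phi' = A\Phi$ holds entry by entry, as the two observations predict.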

Now, recall (3.7.2) where we expressed the complementary solution in
the form $\mathbf{x}_h = \Phi(t)\mathbf{C}$. As we now seek a particular solution $\mathbf{x}_p$ to the
nonhomogeneous equation, it is natural to suppose that $\mathbf{x}_p$ has the form

\[ \mathbf{x}_p(t) = \Phi(t)\mathbf{u}(t) \tag{3.7.3} \]

where $\mathbf{u}(t)$ is a function yet to be determined. We now substitute this guess for
$\mathbf{x}_p$ into $\mathbf{x}' = A\mathbf{x} + \mathbf{b}(t)$ to see what conditions $\mathbf{u}$ must satisfy. For ease of display,
in what follows we suppress the "$(t)$" notation in each of the functions $\Phi$, $\mathbf{u}$, $\mathbf{u}'$,
and $\mathbf{b}$. By the product rule,

\[ \mathbf{x}_p' = (\Phi\mathbf{u})' = \Phi\mathbf{u}' + \Phi'\mathbf{u} \]

and so substituting into the system $\mathbf{x}' = A\mathbf{x} + \mathbf{b}(t)$, we have

\[ \Phi\mathbf{u}' + \Phi'\mathbf{u} = A\Phi\mathbf{u} + \mathbf{b} \tag{3.7.4} \]

Recalling our observation above that $\Phi' = A\Phi$, we can substitute in (3.7.4)
to find

\[ \Phi\mathbf{u}' + A\Phi\mathbf{u} = A\Phi\mathbf{u} + \mathbf{b} \tag{3.7.5} \]

We next subtract $A\Phi\mathbf{u}$ from both sides of (3.7.5) to deduce that

\[ \Phi\mathbf{u}' = \mathbf{b} \tag{3.7.6} \]

Since we are interested in determining the unknown function $\mathbf{u}$, and we know
that $\Phi$ is nonsingular, we may now write

\[ \mathbf{u}' = \Phi^{-1}\mathbf{b} \tag{3.7.7} \]

and, therefore, $\mathbf{u}$ must have the form

\[ \mathbf{u}(t) = \int \Phi^{-1}(t)\mathbf{b}(t)\, dt \tag{3.7.8} \]

Finally, recalling the supposition we made in (3.7.3) that $\mathbf{x}_p = \Phi\mathbf{u}$, (3.7.8) now
implies

\[ \mathbf{x}_p(t) = \Phi(t) \int \Phi^{-1}(t)\mathbf{b}(t)\, dt \tag{3.7.9} \]

It is remarkable how this form of $\mathbf{x}_p$ aligns with our experience with a single linear
first-order differential equation and the form of its solution given by (3.7.1). We
summarize our above work in the following theorem.


Theorem 3.7.1 If $A$ is an $n \times n$ matrix with constant entries, $\Phi(t)$ is
the fundamental solution matrix of the homogeneous system of differential
equations $\mathbf{x}' = A\mathbf{x}$, and $\mathbf{b}(t)$ is a continuous vector function, then a particular
solution $\mathbf{x}_p$ to the nonhomogeneous system $\mathbf{x}' = A\mathbf{x} + \mathbf{b}(t)$ is given by

\[ \mathbf{x}_p(t) = \Phi(t) \int \Phi^{-1}(t)\mathbf{b}(t)\, dt \tag{3.7.10} \]

The approach to finding a particular solution given in theorem 3.7.1 is often
called variation of parameters. We next consider an example to see theorem 3.7.1
at work.
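
Before turning to examples, it may help to confirm formula (3.7.10) by differentiating it directly. The following short derivation is a sketch that uses only the product rule, the Fundamental Theorem of Calculus, and the two facts noted earlier, namely $\Phi' = A\Phi$ and the invertibility of $\Phi$:

\[ \mathbf{x}_p' = \Phi' \int \Phi^{-1}\mathbf{b}\, dt + \Phi\,\Phi^{-1}\mathbf{b} = A\Phi \int \Phi^{-1}\mathbf{b}\, dt + \mathbf{b} = A\mathbf{x}_p + \mathbf{b}. \]

So any antiderivative of $\Phi^{-1}\mathbf{b}$ produces a solution; different constants of integration only add a multiple of the columns of $\Phi$, which is a homogeneous solution.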


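Example 3.7.1 Find the general solution of the nonhomogeneous system
given by

\[ \mathbf{x}' = \begin{bmatrix} 2 & -1 \\ 3 & -2 \end{bmatrix} \mathbf{x} + \begin{bmatrix} 0 \\ 4t \end{bmatrix} \]

Solution. From our determination of the eigenvalues and eigenvectors of the
same coefficient matrix in example 3.6.2, the complementary solution is

\[ \mathbf{x}_h = c_1 e^{-t} \begin{bmatrix} 1 \\ 3 \end{bmatrix} + c_2 e^{t} \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]

Therefore, the fundamental matrix is

\[ \Phi(t) = \begin{bmatrix} e^{-t} & e^{t} \\ 3e^{-t} & e^{t} \end{bmatrix} \]

According to (3.7.10), we next need to compute $\Phi^{-1}$. While the inverse of this
matrix of functions may be computed by row-reducing $[\Phi \mid I]$ in the usual way,
because of the function coefficients in $\Phi$ it is much easier to use a shortcut for
computing the inverse of a $2 \times 2$ matrix that we established in exercise 19 of
section 1.9. Specifically, if

\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \]

is an invertible matrix, then

\[ A^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix} \]

Here, since $\det(\Phi) = e^{-t}e^{t} - 3e^{-t}e^{t} = -2$, it follows that

\[ \Phi^{-1} = -\frac{1}{2} \begin{bmatrix} e^{t} & -e^{t} \\ -3e^{-t} & e^{-t} \end{bmatrix} \]

Thus, by (3.7.10), we now have

\[ \mathbf{x}_p = \begin{bmatrix} e^{-t} & e^{t} \\ 3e^{-t} & e^{t} \end{bmatrix} \int -\frac{1}{2} \begin{bmatrix} e^{t} & -e^{t} \\ -3e^{-t} & e^{-t} \end{bmatrix} \begin{bmatrix} 0 \\ 4t \end{bmatrix} dt = \begin{bmatrix} e^{-t} & e^{t} \\ 3e^{-t} & e^{t} \end{bmatrix} \int \begin{bmatrix} 2te^{t} \\ -2te^{-t} \end{bmatrix} dt \]

Integrating the vector function component-wise by parts and computing the
subsequent matrix product,

\[ \mathbf{x}_p = \begin{bmatrix} e^{-t} & e^{t} \\ 3e^{-t} & e^{t} \end{bmatrix} \begin{bmatrix} 2(t-1)e^{t} \\ 2(t+1)e^{-t} \end{bmatrix} = \begin{bmatrix} 2(t-1) + 2(t+1) \\ 6(t-1) + 2(t+1) \end{bmatrix} = \begin{bmatrix} 4t \\ 8t - 4 \end{bmatrix} \]

Therefore, the general solution to the original nonhomogeneous system is

\[ \mathbf{x} = \mathbf{x}_h + \mathbf{x}_p = c_1 e^{-t} \begin{bmatrix} 1 \\ 3 \end{bmatrix} + c_2 e^{t} \begin{bmatrix} 1 \\ 1 \end{bmatrix} + \begin{bmatrix} 4t \\ 8t - 4 \end{bmatrix} \]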
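
A quick substitution confirms the particular solution found in this example. The check below assumes the coefficient matrix with rows $[2\ \ {-1}]$ and $[3\ \ {-2}]$ and the forcing vector $\mathbf{b}(t) = [0\ \ 4t]^T$, which are the data consistent with the complementary solution and the result $\mathbf{x}_p = [4t\ \ 8t-4]^T$ above.

\[ \mathbf{x}_p' = \begin{bmatrix} 4 \\ 8 \end{bmatrix}, \qquad A\mathbf{x}_p + \mathbf{b} = \begin{bmatrix} 2(4t) - (8t - 4) \\ 3(4t) - 2(8t - 4) \end{bmatrix} + \begin{bmatrix} 0 \\ 4t \end{bmatrix} = \begin{bmatrix} 4 \\ -4t + 8 \end{bmatrix} + \begin{bmatrix} 0 \\ 4t \end{bmatrix} = \begin{bmatrix} 4 \\ 8 \end{bmatrix}. \]

Both sides agree, so $\mathbf{x}_p$ satisfies $\mathbf{x}' = A\mathbf{x} + \mathbf{b}(t)$.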


Example 3.7.1 demonstrates that there are three key steps in the solution to
systems of the form $\mathbf{x}' = A\mathbf{x} + \mathbf{b}(t)$. The first is solving the related homogeneous
system $\mathbf{x}' = A\mathbf{x}$ to determine the fundamental solution matrix $\Phi(t)$. Next, we
have to compute $\Phi^{-1}(t)$. And finally, we must integrate the vector function
given by $\Phi^{-1}(t)\mathbf{b}(t)$. Since we are seeking just one particular solution $\mathbf{x}_p$, there
is no need to include the arbitrary constants that arise in antidifferentiating
$\Phi^{-1}(t)\mathbf{b}(t)$.
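
The three steps can be sketched in Maple for the system of example 3.7.1. This is only an illustrative session, not a prescribed method: the names A, b, Phi, PhiInv, u, and xp are choices made here, the matrix A and vector b are entered on the assumption that the system is the one with coefficient rows [2, -1] and [3, -2] and forcing vector [0, 4t], and the last line is an added check by substitution.

```maple
> with(LinearAlgebra):
> # Data of the system x' = A x + b(t) (assumed as described above)
> A := Matrix([[2, -1], [3, -2]]):
> b := <0, 4*t>:
> # Step 1: fundamental solution matrix; its columns are the homogeneous solutions
> Phi := <<exp(-t),3*exp(-t)>|<exp(t),exp(t)>>:
> # Step 2: invert Phi
> PhiInv := simplify(MatrixInverse(Phi));
> # Step 3: integrate PhiInv.b component-wise (no constants), then multiply by Phi
> u := map(int, simplify(PhiInv . b), t);
> xp := simplify(Phi . u);
> # Check: the residual xp' - (A xp + b) should simplify to the zero vector
> simplify(map(diff, xp, t) - (A . xp + b));
```

If the session runs as intended, xp should agree with the particular solution with entries 4t and 8t - 4 found by hand.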

We close this section with a second example that shows the computations 
involved when more complicated functions are present in b(t). 


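Example 3.7.2 Find the general solution of the nonhomogeneous system
given by

\[ \mathbf{x}' = \begin{bmatrix} 2 & -1 \\ 3 & -2 \end{bmatrix} \mathbf{x} + \begin{bmatrix} 1/(e^{t}+1) \\ 1 \end{bmatrix} \]

Solution. We first find $\mathbf{x}_h$. By finding the eigenvalues and eigenvectors of the
coefficient matrix $A$, it is straightforward to show that

\[ \mathbf{x}_h = c_1 e^{-t} \begin{bmatrix} 1 \\ 3 \end{bmatrix} + c_2 e^{t} \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]

Therefore, the fundamental solution matrix is

\[ \Phi(t) = \begin{bmatrix} e^{-t} & e^{t} \\ 3e^{-t} & e^{t} \end{bmatrix} \]

Moreover, we can show that

\[ \Phi^{-1}(t) = -\frac{1}{2} \begin{bmatrix} e^{t} & -e^{t} \\ -3e^{-t} & e^{-t} \end{bmatrix} \]

We are now ready to compute $\mathbf{x}_p$ and write

\[ \begin{aligned} \mathbf{x}_p(t) &= \Phi(t) \int \Phi^{-1}(t)\mathbf{b}(t)\, dt \\ &= \begin{bmatrix} e^{-t} & e^{t} \\ 3e^{-t} & e^{t} \end{bmatrix} \int -\frac{1}{2} \begin{bmatrix} e^{t} & -e^{t} \\ -3e^{-t} & e^{-t} \end{bmatrix} \begin{bmatrix} 1/(e^{t}+1) \\ 1 \end{bmatrix} dt \\ &= \begin{bmatrix} e^{-t} & e^{t} \\ 3e^{-t} & e^{t} \end{bmatrix} \int \frac{1}{2} \begin{bmatrix} e^{2t}/(e^{t}+1) \\ (2e^{-t} - 1)/(e^{t}+1) \end{bmatrix} dt \end{aligned} \]

At this point, it is easiest to use a computer algebra system to integrate and
complete our calculation of $\mathbf{x}_p$. Doing so, and then finding the required matrix
product, we have

\[ \begin{aligned} \mathbf{x}_p(t) &= \begin{bmatrix} e^{-t} & e^{t} \\ 3e^{-t} & e^{t} \end{bmatrix} \begin{bmatrix} \frac{1}{2}e^{t} - \frac{1}{2}\ln(e^{t}+1) \\ -e^{-t} - \frac{3}{2}t + \frac{3}{2}\ln(e^{t}+1) \end{bmatrix} \\ &= \begin{bmatrix} -\frac{1}{2} - \frac{1}{2}e^{-t}\ln(e^{t}+1) - \frac{3}{2}te^{t} + \frac{3}{2}e^{t}\ln(e^{t}+1) \\ \frac{1}{2} - \frac{3}{2}e^{-t}\ln(e^{t}+1) - \frac{3}{2}te^{t} + \frac{3}{2}e^{t}\ln(e^{t}+1) \end{bmatrix} \end{aligned} \]

Hence, the general solution to the given nonhomogeneous system is

\[ \mathbf{x} = \mathbf{x}_h + \mathbf{x}_p = c_1 e^{-t} \begin{bmatrix} 1 \\ 3 \end{bmatrix} + c_2 e^{t} \begin{bmatrix} 1 \\ 1 \end{bmatrix} + \begin{bmatrix} -\frac{1}{2} - \frac{1}{2}e^{-t}\ln(e^{t}+1) - \frac{3}{2}te^{t} + \frac{3}{2}e^{t}\ln(e^{t}+1) \\ \frac{1}{2} - \frac{3}{2}e^{-t}\ln(e^{t}+1) - \frac{3}{2}te^{t} + \frac{3}{2}e^{t}\ln(e^{t}+1) \end{bmatrix} \]

At each stage in applying variation of parameters it is essential to simplify. In
particular, $\Phi^{-1}(t)$ should be simplified as much as possible before computing
$\Phi^{-1}(t)\mathbf{b}(t)$, and similarly, $\int \Phi^{-1}(t)\mathbf{b}(t)\, dt$ should be simplified as much as
possible before computing $\Phi(t) \int \Phi^{-1}(t)\mathbf{b}(t)\, dt$. One option, of course, is
to use a computer algebra system to avoid the more tedious aspects of the
computations. We offer some suggestions for how to use Maple to assist in the
computations in the following subsection.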


3.7.1 Applying variation of parameters using Maple 


Here we address how Maple can be used to execute the computations in a 
problem such as the one posed in example 3.7.2, where we are interested in 
solving the nonhomogeneous linear system of equations given by 


\[ \mathbf{x}' = \begin{bmatrix} 2 & -1 \\ 3 & -2 \end{bmatrix} \mathbf{x} + \begin{bmatrix} 1/(e^{t}+1) \\ 1 \end{bmatrix} \]

As usual, we load the Linear Algebra package.
> with(LinearAlgebra):


Because we already know how to find the complementary solution, we focus
on determining $\mathbf{x}_p$ by variation of parameters. First, we use the complementary
solution,

\[ \mathbf{x}_h = c_1 e^{-t} \begin{bmatrix} 1 \\ 3 \end{bmatrix} + c_2 e^{t} \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \]

to define the fundamental matrix $\Phi(t)$:
> Phi := <<exp(-t),3*exp(-t)>|<exp(t),exp(t)>>; 


We next use the MatrixInverse command to find $\Phi^{-1}$ by entering


> MatrixInverse(Phi);


The resulting output is 


We can simplify this result using negative exponents; Maple can do so through
the following command, through which we also store $\Phi^{-1}$ in PhiInv:


> PhiInv := simplify(MatrixInverse(Phi));


Next, in order to compute $\Phi^{-1}(t)\mathbf{b}(t)$, we must enter the function $\mathbf{b}(t)$. We
enter


> b := <<1/(exp(t)+1),1>>;

and then

> y := simplify(PhiInv.b);


At this point, y is a 2 x 1 array that holds the vector function $\Phi^{-1}(t)\mathbf{b}(t)$.
Specifically, the output for y displayed by Maple is

\[ \begin{bmatrix} \dfrac{1}{2}\,\dfrac{e^{2t}}{e^{t}+1} \\[2ex] -\dfrac{1}{2}\,\dfrac{e^{-t}(-2+e^{t})}{e^{t}+1} \end{bmatrix} \]


To access the components in y, we reference them with the commands y[1,1]
and y[2,1]. In particular, since we have to integrate $\Phi^{-1}(t)\mathbf{b}(t)$ component-wise,
we enter


> Y := <<int(y[1,1],t), int(y[2,1],t)>>;


This last command produces the output

\[ \begin{bmatrix} \frac{1}{2}e^{t} - \frac{1}{2}\ln(e^{t}+1) \\ -\frac{1}{e^{t}} - \frac{3}{2}\ln(e^{t}) + \frac{3}{2}\ln(e^{t}+1) \end{bmatrix} \]

and stores $\int \Phi^{-1}(t)\mathbf{b}(t)\, dt$ in Y. Note that Maple has not made the obvious
simplification $\ln(e^{t}) = t$. Finally, in order to compute $\Phi(t) \int \Phi^{-1}(t)\mathbf{b}(t)\, dt$, we
need to enter Phi.Y. Of course, we again want to simplify, so we use


> simplify(Phi.Y);

which produces the output

\[ \begin{bmatrix} -\frac{1}{2} - \frac{1}{2}e^{-t}\ln(e^{t}+1) - \frac{3}{2}e^{t}\ln(e^{t}) + \frac{3}{2}e^{t}\ln(e^{t}+1) \\ \frac{1}{2} - \frac{3}{2}e^{-t}\ln(e^{t}+1) - \frac{3}{2}e^{t}\ln(e^{t}) + \frac{3}{2}e^{t}\ln(e^{t}+1) \end{bmatrix} \]

This last result is the particular solution $\mathbf{x}_p$ to the original system of
differential equations given in example 3.7.2. Note again that we can
simplify $\ln(e^{t})$ to $t$ in each component.
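
As an optional last step, the result can be checked by substituting it back into the system. The session below is a self-contained sketch: the matrix A is entered on the assumption that the coefficient matrix has rows [2, -1] and [3, -2], and the names A, xp, and the residual computation are additions made here for illustration rather than part of the session described above.

```maple
> with(LinearAlgebra):
> A := Matrix([[2, -1], [3, -2]]):
> Phi := <<exp(-t),3*exp(-t)>|<exp(t),exp(t)>>:
> b := <<1/(exp(t)+1),1>>:
> # Particular solution by variation of parameters: Phi times the integral of Phi^(-1) b
> Y := map(int, simplify(MatrixInverse(Phi) . b), t):
> xp := simplify(Phi . Y):
> # Residual xp' - A xp - b; a result that simplifies to zero confirms xp
> simplify(map(diff, xp, t) - A . xp - b);
```

A residual whose entries both simplify to 0 indicates that xp satisfies the nonhomogeneous system.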


Exercises 3.7 
1. Consider the system of differential equations given by 
,_|3 2 5 
aL f+ [A] 


(a) Based on the form of $\mathbf{b}(t)$, make a guess and determine $\mathbf{x}_p$ by
undetermined coefficients.
(b) Use variation of parameters to determine $\mathbf{x}_p$.
2. Consider the system of differential equations given by


pe 2 et 
x=|> s|=+| 0 


(a) Based on the form of $\mathbf{b}(t)$, make a guess and determine $\mathbf{x}_p$ by
undetermined coefficients.
(b) Use variation of parameters to determine $\mathbf{x}_p$.


3. Consider the system of differential equations given by


re 2 3e! 
e=[2 ab] 


(a) Based on the form of $\mathbf{b}(t)$, what is the natural guess for $\mathbf{x}_p$? Show that
this natural guess fails to work.

(b) Compute the complementary solution $\mathbf{x}_h$ to the stated system and use
its form to explain why the natural guess in (a) is not a valid one.

(c) Use variation of parameters to determine $\mathbf{x}_p$.


4. Consider the system of differential equations given by

\[ \mathbf{x}' = \begin{bmatrix} 0 & 2 \\ 1 & -1 \end{bmatrix} \mathbf{x} + \begin{bmatrix} 4\sin t \\ 2\sin t \end{bmatrix} \]

(a) Based on the form of $\mathbf{b}(t)$, what would be the natural guess to make
for $\mathbf{x}_p$? How many undetermined coefficients would need to be
computed?
(b) Use variation of parameters to determine $\mathbf{x}_p$.


In each of the exercises 5-12, determine the general solution to the given
system by finding $\mathbf{x}_p$ using variation of parameters. Note that in each case, $\Phi(t)$
is given.


re ie 2e! [ze 0 
ee s|=+| il, 2) =|76 est 


1 0 cos2t 20 0 
=I 5 +[asieae 2 =| e! 2 


1 1 e2t e'cost e'sint 
a — 
= —1 i+ i : P(t) = Beet e' cost 
11 249 e'cost e'sint
= 
ees i+ a oi | ont e' cost 
0 at et te?! 0 
0 
1 


x+] 1], @O(t)=| 0 ce O 
0 0 ef 


3.8 Applications of linear systems 


In this section, we consider three fundamental physical problems that may be 
modeled and studied using linear systems of differential equations. 


3.8.1 Mixing problems 


Through our study of the motivating example provided at the start of 
chapter 1 and reconsidered at the beginning of the current chapter, we have 
seen that mixing problems naturally lead to nonhomogeneous linear systems 
of differential equations. Below, we examine a slightly more complicated 
example. 

Consider a system of three tanks connected in such a way that each of the 
tanks has an independent inflow that delivers salt solution to it, each has an 
independent outflow (drain), and each tank is connected to the other two with 
both outflow and inflow pipes. The relevant information about each tank is 
given in table 3.2. 

Table 3.2
Saltwater mixing in three tanks A, B, and C

                                 | Tank A             | Tank B             | Tank C
Tank volume                      | 50 liters          | 100 liters         | 200 liters
Rate of inflow to the tank       | 2 liters/min       | 4 liters/min       | 5 liters/min
Concentration of salt in inflow  | 0.25 g/liter       | 2 g/liter          | 0.9 g/liter
Rate of drain outflow            | 2 liters/min       | 4 liters/min       | 5 liters/min
Rates of outflows to other tanks | to B: 3 liters/min | to C: 1 liter/min  | to A: 4 liters/min
                                 | to C: 4 liters/min | to A: 3 liters/min | to B: 1 liter/min

We set up a system of differential equations whose solution represents the
amount of salt in each tank at time $t$ and state the system in matrix form. For
tank A, we denote the amount of salt (in grams) in the tank at time $t$ (in minutes)
by $x_1(t)$. Similarly, we let $x_2(t)$ and $x_3(t)$ represent the amount of salt in tanks B
and C. A careful check of the given data shows that for each tank the total rates
of inflow and outflow of solution balance so that the volume of solution in each
tank is constant.
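
The balance of inflow and outflow for each tank can be read off from table 3.2. As a brief worked check (all rates in liters/min, with each tank's inflow made up of its independent inflow plus the flows arriving from the other two tanks, and its outflow made up of its drain plus the flows it sends to the other two tanks):

\[ \begin{aligned} \text{Tank A:} &\quad \text{in} = 2 + 3 + 4 = 9, & \text{out} &= 2 + 3 + 4 = 9, \\ \text{Tank B:} &\quad \text{in} = 4 + 3 + 1 = 8, & \text{out} &= 4 + 1 + 3 = 8, \\ \text{Tank C:} &\quad \text{in} = 5 + 4 + 1 = 10, & \text{out} &= 5 + 4 + 1 = 10. \end{aligned} \]

These totals, 9, 8, and 10 liters/min, are the cumulative outflow rates that determine how quickly salt leaves tanks A, B, and C, respectively.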

From the given information on the independent inflow to the tank, we
know that tank A gains salt at a rate of

\[ 0.25\ \frac{\text{g}}{\text{liter}} \cdot 2\ \frac{\text{liters}}{\text{min}} = 0.5\ \frac{\text{g}}{\text{min}} \tag{3.8.1} \]

Furthermore, tank A also gains salt from the two inflows that come from tanks
B and C. For tank B, which contains 100 liters of solution, solution flows to A
at a rate of 3 liters/min with a concentration of $x_2(t)/100$ g/liter, so that salt is
gained by tank A at a rate of

\[ \frac{x_2}{100}\ \frac{\text{g}}{\text{liter}} \cdot 3\ \frac{\text{liters}}{\text{min}} = \frac{3x_2}{100}\ \frac{\text{g}}{\text{min}} \tag{3.8.2} \]

Similarly, the flow from tank C to tank A results in A gaining salt at a rate of

\[ \frac{x_3}{200}\ \frac{\text{g}}{\text{liter}} \cdot 4\ \frac{\text{liters}}{\text{min}} = \frac{x_3}{50}\ \frac{\text{g}}{\text{min}} \tag{3.8.3} \]

Tank A is also losing salt through its three outflows: a drain, flow to tank
B, and flow to tank C. Since the concentration of solution in tank A at time
$t$ is $x_1(t)/50$ g/liter, it follows that each outflow carries this concentration of
salt, doing so at respective rates of 2 liters/min, 3 liters/min, and 4 liters/min.
This shows that solution is leaving tank A at a cumulative rate of 9 liters/min,
therefore causing the rate at which salt is lost from tank A to be

\[ \frac{x_1}{50}\ \frac{\text{g}}{\text{liter}} \cdot 9\ \frac{\text{liters}}{\text{min}} = \frac{9x_1}{50}\ \frac{\text{g}}{\text{min}} \tag{3.8.4} \]

Combining the rates of inflow and outflow in (3.8.1), (3.8.2), (3.8.3), and (3.8.4),
it follows that $x_1(t)$ satisfies the differential equation

\[ x_1' = 0.5 + \frac{3x_2}{100} + \frac{4x_3}{200} - \frac{9x_1}{50} \tag{3.8.5} \]

Similar reasoning shows that $x_2(t)$ and $x_3(t)$ satisfy the differential
equations

\[ x_2' = 8 + \frac{3x_1}{50} + \frac{x_3}{200} - \frac{8x_2}{100} \tag{3.8.6} \]

and

\[ x_3' = 4.5 + \frac{4x_1}{50} + \frac{x_2}{100} - \frac{10x_3}{200} \tag{3.8.7} \]

Rearranging (3.8.5), (3.8.6), and (3.8.7) and writing the system they generate in
matrix form, we see

\[ \mathbf{x}' = \begin{bmatrix} -9/50 & 3/100 & 1/50 \\ 3/50 & -2/25 & 1/200 \\ 2/25 & 1/100 & -1/20 \end{bmatrix} \mathbf{x} + \begin{bmatrix} 0.5 \\ 8 \\ 4.5 \end{bmatrix} \tag{3.8.8} \]

We can easily determine the equilibrium solution to the system by
setting $\mathbf{x}' = \mathbf{0}$ and row-reducing the resulting linear system of equations.
Doing so results in

\[ \left[ \begin{array}{ccc|c} -9/50 & 3/100 & 1/50 & -0.5 \\ 3/50 & -2/25 & 1/200 & -8 \\ 2/25 & 1/100 & -1/20 & -4.5 \end{array} \right] \rightarrow \left[ \begin{array}{ccc|c} 1 & 0 & 0 & 50 \\ 0 & 1 & 0 & 150 \\ 0 & 0 & 1 & 200 \end{array} \right] \]

so that $x_1 = 50$, $x_2 = 150$, $x_3 = 200$ is the only equilibrium solution to the
system. In addition, the eigenpairs of the coefficient matrix $A$ are approximately
$\lambda = -0.030, -0.204, -0.076$ and $\mathbf{v} = [0.203\ \ 0.346\ \ 1]^T$, $[-2.041\ \ 0.949\ \ 1]^T$,
$[-0.168\ \ {-1.250}\ \ 1]^T$. Since all three eigenvalues are real and negative, we
can conclude that the above equilibrium is a stable attracting node. Moreover,
we can determine the general solution to the system. The eigenvalues and
eigenvectors provide us with $\mathbf{x}_h$, the complementary solution, while $\mathbf{x}_p$ is given
by the equilibrium solution so that

\[ \mathbf{x}(t) = c_1 e^{-0.030t} \begin{bmatrix} 0.203 \\ 0.346 \\ 1 \end{bmatrix} + c_2 e^{-0.204t} \begin{bmatrix} -2.041 \\ 0.949 \\ 1 \end{bmatrix} + c_3 e^{-0.076t} \begin{bmatrix} -0.168 \\ -1.250 \\ 1 \end{bmatrix} + \begin{bmatrix} 50 \\ 150 \\ 200 \end{bmatrix} \]
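
The equilibrium and the eigenpairs quoted above can be reproduced with a few Maple commands. The session below is a sketch under the assumption that the system is (3.8.8); the names A, b, and xe are chosen here for illustration.

```maple
> with(LinearAlgebra):
> A := Matrix([[-9/50, 3/100, 1/50], [3/50, -2/25, 1/200], [2/25, 1/100, -1/20]]):
> b := <1/2, 8, 9/2>:
> # Equilibrium: x' = 0 means A x = -b; solve exactly with rational arithmetic
> xe := LinearSolve(A, -b);
> # Floating-point eigenvalues and eigenvectors of the coefficient matrix
> Eigenvectors(evalf(A));
```

Note that Maple scales numerical eigenvectors in its own way, so its columns will generally be scalar multiples of the vectors displayed in the text, which are scaled to have third entry 1.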


We conclude from this example that three connected tanks generate a natural
example of a linear system of nonhomogeneous differential equations. Certainly,
we can envision similar ideas being applied to more complicated scenarios, such
as the spread of a pollutant through a connected chain of rivers and lakes.


3.8.2 Spring-mass systems 


In section 3.1, we developed the linear second-order differential equation that
governs the behavior of a spring-mass system and converted the equation to a
system of two first-order equations. In particular, we learned that for a system
with mass $m$, spring constant $k$, damping constant $c$, and driving force $F(t)$, the
displacement $y(t)$ of the mass from its equilibrium position satisfies the DE

\[ y'' + \frac{c}{m}y' + \frac{k}{m}y = \frac{1}{m}F(t) \tag{3.8.9} \]

Moreover, using the substitution $x_1 = y$ and $x_2 = x_1' = y'$, it follows that (3.8.9)
can be represented by the system

\[ \begin{aligned} x_1' &= x_2 \\ x_2' &= -\frac{k}{m}x_1 - \frac{c}{m}x_2 + \frac{1}{m}F(t) \end{aligned} \tag{3.8.10} \]

Figure 3.15 Two masses $m_1$ and $m_2$ joined
by two springs, at equilibrium.


Next, we consider the more complicated case of a system involving two
masses and two springs, but omit damping and driving forces. In particular,
suppose that a mass $m_1$ is attached to a spring with spring constant $k_1$ and that
from $m_1$ a second spring with constant $k_2$ and mass $m_2$ is attached, as shown
in figure 3.15. While we represent the masses with boxes, for our theoretical
work we assume we are working with point-masses, where all of the mass is
concentrated at a single point. We can envision these points as lying at the
centers of the respective boxes in figure 3.15.

To omit damping, we assume that the surface on which the masses rest is
frictionless. In addition, we set the masses in motion by some collection
of initial displacements and velocities, and we let $x_1(t)$ denote the displacement of
$m_1$ from its equilibrium position and $x_2(t)$ the displacement of $m_2$ from its
equilibrium position, as shown in figure 3.16.

We seek a system of first-order differential equations that models this
situation. Note that $m_1$ has two springs attached to it, so each spring exerts
forces on $m_1$. One is $F_1 = -k_1x_1$, which is the force the first spring exerts to
oppose the displacement of the first mass. Next, observe that when the system is
at equilibrium, the distance between the two masses is some constant $L$. Once
the system is set in motion, the distance between the two masses is $L + x_2 - x_1$.
As such, the second spring is being stretched a length of $x_2 - x_1$ beyond where
it is when the system is at equilibrium. On mass $m_1$ this exerts a force in the
opposite direction of $F_1$, specifically the force $F_2 = k_2(x_2 - x_1)$ on $m_1$. On the
second mass $m_2$ there is only this same force exerted by the second spring, but
in the opposite direction as on $m_1$. In particular, $F_3 = -k_2(x_2 - x_1)$ acts on $m_2$.

Figure 3.16 Two masses $m_1$ and $m_2$ and two
springs displaced from equilibrium.

Now, because we have omitted damping and forcing, these are the only
forces acting on $m_1$ and $m_2$. Newton's second law tells us that the sum of all
forces acting on an object must equal the object's mass times its acceleration. In
particular, we have

\[ \begin{aligned} m_1x_1'' &= -k_1x_1 + k_2(x_2 - x_1) \\ m_2x_2'' &= -k_2(x_2 - x_1) \end{aligned} \]

Dividing through by $m_1$ and $m_2$, respectively, these observations lead us to the
system of linear second-order differential equations

\[ \begin{aligned} x_1'' &= -\frac{k_1}{m_1}x_1 + \frac{k_2}{m_1}(x_2 - x_1) \\ x_2'' &= -\frac{k_2}{m_2}(x_2 - x_1) \end{aligned} \tag{3.8.11} \]

To study the behavior of this system with the techniques that we have developed,
we must convert each of the second-order equations to a system of two
first-order equations. Before doing so, we introduce specific numerical values for the
masses and spring constants to simplify our work. We let $k_1 = 2$ and $k_2 = 1$, and
$m_1 = 2$ and $m_2 = 4$. This yields the system

\[ \begin{aligned} x_1'' &= -x_1 + 0.5(x_2 - x_1) \\ x_2'' &= -0.25(x_2 - x_1) \end{aligned} \tag{3.8.12} \]

Using the substitutions $y_1 = x_1$, $y_2 = y_1' = x_1'$, $y_3 = x_2$, $y_4 = y_3' = x_2'$, it follows
that (3.8.12) results in the system of four first-order equations given by

\[ \begin{aligned} y_1' &= y_2 \\ y_2' &= -y_1 + 0.5(y_3 - y_1) \\ y_3' &= y_4 \\ y_4' &= -0.25(y_3 - y_1) \end{aligned} \tag{3.8.13} \]

Letting $\mathbf{y}$ be the vector $[y_1\ \ y_2\ \ y_3\ \ y_4]^T$, we can write (3.8.13) in matrix form,

\[ \mathbf{y}' = \begin{bmatrix} 0 & 1 & 0 & 0 \\ -1.5 & 0 & 0.5 & 0 \\ 0 & 0 & 0 & 1 \\ 0.25 & 0 & -0.25 & 0 \end{bmatrix} \mathbf{y} \tag{3.8.14} \]


From this, we can now analyze the overall behavior of the coupled spring-mass
system. In particular, the eigenvalues and eigenvectors of the coefficient matrix
in (3.8.14) will enable us to find the general solution $\mathbf{y}$. Given initial conditions,
we can fully describe the functions $y_i(t)$ (particularly $y_1$ and $y_3$, which represent
the respective displacements of the masses in the system) and understand the
behavior of the system over time. This problem and others like it are explored
further in the exercises at the end of this section.
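
As a preview of that analysis, the eigenvalues can be found by hand. The sketch below assumes the numerical values $k_1 = 2$, $k_2 = 1$, $m_1 = 2$, $m_2 = 4$ used above, and looks for solutions of the second-order system with displacements $x_1 = v_1e^{\lambda t}$, $x_2 = v_2e^{\lambda t}$, which requires $\lambda^2 v_1 = -1.5v_1 + 0.5v_2$ and $\lambda^2 v_2 = 0.25v_1 - 0.25v_2$. A nonzero $(v_1, v_2)$ exists only when

\[ (\lambda^2 + 1.5)(\lambda^2 + 0.25) - (0.5)(0.25) = \lambda^4 + 1.75\lambda^2 + 0.25 = 0, \]

\[ \lambda^2 = \frac{-7 \pm \sqrt{33}}{8} \approx -0.157,\ -1.593, \qquad \lambda \approx \pm 0.396\,i,\ \pm 1.262\,i. \]

All four eigenvalues are purely imaginary, which is consistent with the absence of damping: the masses oscillate without decay, combining two natural frequencies.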
3.8.3 RLC circuits


The flow of electricity through a circuit, much like the flow of water in a 
pipe, naturally involves relationships with rates of change. As such, the study 
of electrical current involves differential equations. Here, we explore some 
fundamental properties of electricity and how these lead to such equations. 

Throughout what follows, we will make use of the analogy that the flow of
charge carriers in an electrical circuit is like the flow of particles in a moving
stream of water. Just as we consider flow of water in a pipe to be the number
of water particles flowing past a given point during a certain time interval, the
current $I(t)$ in a circuit at time $t$ is proportional to the number of positive
charge carriers that move past any given point per second in the conductor.
Note particularly that current measures a rate of change of charge.

Current is measured in amperes (amp), the base unit through which all other
units will be defined. One ampere corresponds to $6.2420 \times 10^{18}$ charge carriers
per second moving past a given point. The unit of charge is a coulomb, which is
the amount of charge that flows through a cross section of a wire in one second
when a one amp current is flowing. In other words,


1 amp = 1 coulomb/s


Here, we begin to see how derivatives and integrals are involved in the study
of electricity. The current $I(t)$ at time $t$ is by definition a rate of change of charge.
Thus, by the Fundamental Theorem of Calculus, the total amount of charge that
flows past a given point on a time interval $[t_0, t_1]$ is given by

\[ \int_{t_0}^{t_1} I(s)\, ds \tag{3.8.15} \]

If we let $Q(t)$ measure the total accumulated charge at a given point in the circuit
from time $t_0$ up to time $t$, then we have

\[ Q(t) = Q(t_0) + \int_{t_0}^{t} I(s)\, ds \tag{3.8.16} \]

and therefore $Q'(t) = I(t)$.

As current flows through a circuit, the charge carriers and elements in the
circuit exchange energy. We, therefore, define a potential function $V$ throughout
a circuit. The energy (per coulomb of charge) that has been exchanged by the
charge carriers as they flow from point $a$ to point $b$ is computed as

\[ V_{ab} = V_a - V_b \]

where $V_a$ and $V_b$ are the values of the potential function at points $a$ and $b$ in the
circuit.

The difference $V_{ab}$ is called the voltage drop from $a$ to $b$ and is measured in
joules per coulomb, which are also known as volts. If we again think of the flow
of water through a pipe, the concept of voltage drop is analogous to the change
in water pressure between points $a$ and $b$. Batteries, for example, maintain a
voltage drop between two terminals; the energy provided by a battery's internal
chemicals produces a constant amount of energy per coulomb as charge carriers
move throughout the battery, which raises the function $V$ by the voltage rating
of the battery.

As current flows through a circuit, energy is lost. This makes the potential
$V$ at one end lower than the potential at the other. A portion of a circuit,
say from $a$ to $b$, over which a substantial amount of energy is lost is called
a resistor. Good examples of resistors are light bulbs and heating
elements, because they show how electrical energy can be converted into light
and heat.

The voltage drop across a resistor and the current flowing through it are
modeled by Ohm's law, which says that the potential difference $V_{ab}$ between the
endpoints $a$ and $b$ of a resistor is proportional to the current flowing through
the resistor. In other words,

\[ V_{ab} = IR \tag{3.8.17} \]

where $R$ is a constant called the resistance. The unit of resistance is the ohm,
which is equal to one volt per ampere, or one volt-second per coulomb.

A changing electrical current $I(t)$ in a segment of a circuit will create a
changing magnetic field that results in a voltage drop between the ends of a
segment. When this effect is large, such as in a coil between points $b$ and $c$ (the
effect can be magnified by different geometrical arrangements of the circuit),
the device that induces the effect is called an inductor. Faraday's law tells us what
happens with the voltage drops across inductors. In particular, the voltage drop
across an inductor is proportional to the rate of change of the current, or, in
other words

\[ V_{bc} = L\frac{dI}{dt} \tag{3.8.18} \]

where $L$ is a constant called the inductance. Note specifically that Faraday's law
regards the rate of change of current. Inductance is measured in henries.

Finally, if a circuit is broken and we include two plates separated by an 
insulating material (such as air), and the terminals of the circuit are connected 
to a voltage source (such as a battery), then charges will build up on the plates. 
In the ongoing analogy to water, this is similar to a tank used to store water to 
provide a source of pressure. We call the set of plates a capacitor, and speak of 
the total charge Q(t) on the capacitor. 

From (3.8.16), since we know that current $I$ is the rate of change of charge
$Q$, if we know an initial charge $Q(t_0)$, then given a current $I(t)$ we can find the
charge $Q(t)$ by the relationship

\[ Q(t) = Q(t_0) + \int_{t_0}^{t} I(s)\, ds \tag{3.8.19} \]

Finally, Coulomb's law states that the voltage drop $V_{cd}$ across a capacitor between
points $c$ and $d$ is proportional to the charge on the capacitor, or

\[ V_{cd} = \frac{1}{C}Q(t) = \frac{1}{C}\left( Q(t_0) + \int_{t_0}^{t} I(s)\, ds \right) \tag{3.8.20} \]

where $C$ is called the capacitance of the capacitor and is measured in farads.

All three of the laws (3.8.17), (3.8.18), and (3.8.20) are based on
experimental observations of circuits. Similarly, Kirchhoff's law is a conservation
law that tells us what we can expect for the voltage drops across various parts of
a circuit. Simply stated, Kirchhoff's law says that if we pick a sequence of points
in a closed circuit, then the sum of the voltage drops across the segments between
consecutive points is zero. Specifically, for points $a_1, a_2, \ldots, a_n$,

\[ V_{a_1a_2} + V_{a_2a_3} + \cdots + V_{a_{n-1}a_n} + V_{a_na_1} = 0 \tag{3.8.21} \]

A final necessary law for us to consider is Kirchhoff's current law, which tells
us that at each point of a circuit, the sum of currents flowing into a point
equals the sum of the currents flowing out. For a simple RLC circuit with one
loop, Kirchhoff's current law guarantees that we can use a single function $I(t)$ to
model the current at any point at a given time $t$; for circuits with multiple loops,
multiple functions $I(t)$ are needed.

Now we are prepared to see how these fundamental laws of electricity lead
to a second-order differential equation, and hence a $2 \times 2$ system of first-order
DEs. Let us consider an RLC circuit that consists of a resistor, inductor, and
capacitor, along with some energy (voltage) source $E(t)$, arranged in series, as
shown in figure 3.17. Kirchhoff's law leads us directly to second-order differential
equations that determine the behavior of the current $I(t)$ in the circuit and the
charge $Q(t)$ on the capacitor.

By Ohm's law, we know that $V_{ab} = IR$. Similarly, Faraday's law implies that
$V_{bc} = L\frac{dI}{dt}$, and Coulomb's law tells us that
$V_{cd} = \frac{1}{C}Q(t) = \frac{1}{C}\left( Q(t_0) + \int_{t_0}^{t} I(s)\, ds \right)$.
Finally, we know from the voltage source that $V_{da} = -E(t)$. Kirchhoff's law now
yields the equation $V_{ab} + V_{bc} + V_{cd} + V_{da} = 0$, or

\[ RI(t) + LI'(t) + \frac{1}{C}Q(t) = E(t) \tag{3.8.22} \]

Figure 3.17 An RLC circuit with resistance
$R$, inductance $L$, capacitance $C$, and energy
source $E(t)$.

Recalling that $Q'(t) = I(t)$, we may rewrite (3.8.22) in two different ways. If we
differentiate both sides of (3.8.22), and rearrange the terms in decreasing order
of derivatives, it follows immediately that the current $I(t)$ must satisfy the linear
second-order differential equation

\[ LI''(t) + RI'(t) + \frac{1}{C}I(t) = E'(t) \tag{3.8.23} \]

If instead we substitute $Q'$ for $I$ in (3.8.22), then we see that $Q$ is the solution to
the linear second-order differential equation

\[ LQ''(t) + RQ'(t) + \frac{1}{C}Q(t) = E(t) \tag{3.8.24} \]

We can therefore study the behaviors of different RLC circuits based on the given
resistance, inductance, capacitance, and supplied voltage. Moreover, as we well
know, any such linear second-order differential equation may be converted to a
system of first-order equations. For example, letting $x_1 = I$ and $x_2 = I'$, we can
convert (3.8.23) to the system of equations

\[ \begin{aligned} x_1' &= x_2 \\ x_2' &= -\frac{1}{CL}x_1 - \frac{R}{L}x_2 + \frac{1}{L}E'(t) \end{aligned} \]

Example 3.8.1 Determine all solutions $I(t)$ for an RLC circuit when $L = 20$ H,
$R = 80\ \Omega$, $C = 10^{-2}$ F, and the external voltage is given by the function
$E(t) = 50\sin 2t$.


Solution. From (3.8.23) and the given information, we can immediately
determine the second-order differential equation that $I(t)$ satisfies. In particular,
since $E(t) = 50\sin 2t$, we have $E'(t) = 100\cos 2t$, and using the values for $L$, $C$,
and $R$, $I(t)$ is a solution to the equation

\[ 20I'' + 80I' + 100I = 100\cos 2t \tag{3.8.25} \]

Using the substitution $x_1 = I$ and $x_2 = I'$ and multiplying both sides of (3.8.25)
by 1/20, the system becomes

\[ \begin{aligned} x_1' &= x_2 \\ x_2' &= -5x_1 - 4x_2 + 5\cos 2t \end{aligned} \]

From this, we can write the system in matrix form as

\[ \mathbf{x}' = \begin{bmatrix} 0 & 1 \\ -5 & -4 \end{bmatrix} \mathbf{x} + \begin{bmatrix} 0 \\ 5\cos 2t \end{bmatrix} \tag{3.8.26} \]

For the coefficient matrix $A$ in (3.8.26), we compute the eigenvalues and
eigenvectors in order to find the complementary solution $\mathbf{x}_h$ of the system.
Doing so, we find that $A$ has complex eigenvalues and eigenvectors; one
eigenvalue-eigenvector pair is

\[ \lambda = -2 + i, \qquad \mathbf{v} = \begin{bmatrix} -2 - i \\ 5 \end{bmatrix} \]
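
For readers who want to confirm this eigenpair by hand, here is a brief check. It assumes the coefficient matrix of the system is the one with rows $[0\ \ 1]$ and $[-5\ \ {-4}]$, as obtained from $x_1' = x_2$, $x_2' = -5x_1 - 4x_2 + 5\cos 2t$.

\[ \det(A - \lambda I) = (-\lambda)(-4 - \lambda) + 5 = \lambda^2 + 4\lambda + 5 = 0 \quad \Longrightarrow \quad \lambda = \frac{-4 \pm \sqrt{16 - 20}}{2} = -2 \pm i, \]

\[ (A - \lambda I)\mathbf{v} = \begin{bmatrix} 2 - i & 1 \\ -5 & -2 - i \end{bmatrix} \begin{bmatrix} -2 - i \\ 5 \end{bmatrix} = \begin{bmatrix} -(2 - i)(2 + i) + 5 \\ 5(2 + i) - 5(2 + i) \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \quad \text{for } \lambda = -2 + i. \]

The negative real part of the eigenvalues is what makes the homogeneous part of the solution decay over time.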


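Writing

\[ \mathbf{z}(t) = e^{(-2+i)t} \begin{bmatrix} -2 - i \\ 5 \end{bmatrix} \]

we know from theorem 3.5.2 that the real and imaginary parts of the
vector function $\mathbf{z}(t)$ will form two real linearly independent solutions to the
homogeneous system $\mathbf{x}' = A\mathbf{x}$. Rewriting $\mathbf{z}$ using Euler's formula,

\[ \begin{aligned} \mathbf{z}(t) &= e^{-2t}(\cos t + i\sin t)\left( \begin{bmatrix} -2 \\ 5 \end{bmatrix} + i\begin{bmatrix} -1 \\ 0 \end{bmatrix} \right) \\ &= e^{-2t}\begin{bmatrix} -2\cos t + \sin t \\ 5\cos t \end{bmatrix} + i\,e^{-2t}\begin{bmatrix} -\cos t - 2\sin t \\ 5\sin t \end{bmatrix} \end{aligned} \]

The real and imaginary parts of $\mathbf{z}$ are real linearly independent solutions to
$\mathbf{x}' = A\mathbf{x}$, so we have determined that the complementary solution to the original
system is

\[ \mathbf{x}_h = c_1e^{-2t}\begin{bmatrix} -2\cos t + \sin t \\ 5\cos t \end{bmatrix} + c_2e^{-2t}\begin{bmatrix} -\cos t - 2\sin t \\ 5\sin t \end{bmatrix} \]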

In theory, we are now ready to apply variation of parameters to find a particular
solution $\mathbf{x}_p$. While we could do so here, the computations get remarkably
cumbersome. In the next chapter on higher order differential equations, we
will learn that for certain higher order equations, making a good guess at the
form of a particular solution provides the simplest approach. In fact, we will
even see that keeping certain second-order equations in that form, rather than
converting them to systems of first-order equations, often is the best way to
proceed.
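For now, we will guess a form for $\mathbf{x}_p$. Since

\[ \mathbf{b}(t) = \begin{bmatrix} 0 \\ 5\cos 2t \end{bmatrix} \]

we assume that a particular solution $\mathbf{x}_p$ has form

\[ \mathbf{x}_p = \begin{bmatrix} a\cos 2t + b\sin 2t \\ c\cos 2t + d\sin 2t \end{bmatrix} \]

From this, it follows

\[ \mathbf{x}_p' = \begin{bmatrix} -2a\sin 2t + 2b\cos 2t \\ -2c\sin 2t + 2d\cos 2t \end{bmatrix} \]

Substituting $\mathbf{x}_p$ and $\mathbf{x}_p'$ for $\mathbf{x}$ and $\mathbf{x}'$ in (3.8.26), we have

\[ \begin{bmatrix} -2a\sin 2t + 2b\cos 2t \\ -2c\sin 2t + 2d\cos 2t \end{bmatrix} = \begin{bmatrix} c\cos 2t + d\sin 2t \\ -5a\cos 2t - 5b\sin 2t - 4c\cos 2t - 4d\sin 2t \end{bmatrix} + \begin{bmatrix} 0 \\ 5\cos 2t \end{bmatrix} \]

Equating the coefficients of $\sin 2t$ and $\cos 2t$ in the entries of the vectors in this
most recent vector equation, the following system of four linear equations in $a$,
$b$, $c$, and $d$ arises:

\[ \begin{aligned} -2a &= d \\ 2b &= c \\ -2c &= -5b - 4d \\ 2d &= -5a - 4c + 5 \end{aligned} \]

Rearranging this system to write it in matrix form and row-reducing, we observe

\[ \left[ \begin{array}{cccc|c} -2 & 0 & 0 & -1 & 0 \\ 0 & 2 & -1 & 0 & 0 \\ 0 & 5 & -2 & 4 & 0 \\ 5 & 0 & 4 & 2 & 5 \end{array} \right] \rightarrow \left[ \begin{array}{cccc|c} 1 & 0 & 0 & 0 & 1/13 \\ 0 & 1 & 0 & 0 & 8/13 \\ 0 & 0 & 1 & 0 & 16/13 \\ 0 & 0 & 0 & 1 & -2/13 \end{array} \right] \]

Thus we conclude that a particular solution is

\[ \mathbf{x}_p = \begin{bmatrix} \frac{1}{13}\cos 2t + \frac{8}{13}\sin 2t \\ \frac{16}{13}\cos 2t - \frac{2}{13}\sin 2t \end{bmatrix} \]

In conjunction with our earlier work to find $\mathbf{x}_h$, we have determined that
the general solution to the system of first-order differential equations given
by (3.8.26) is

\[ \mathbf{x} = c_1e^{-2t}\begin{bmatrix} -2\cos t + \sin t \\ 5\cos t \end{bmatrix} + c_2e^{-2t}\begin{bmatrix} -\cos t - 2\sin t \\ 5\sin t \end{bmatrix} + \begin{bmatrix} \frac{1}{13}\cos 2t + \frac{8}{13}\sin 2t \\ \frac{16}{13}\cos 2t - \frac{2}{13}\sin 2t \end{bmatrix} \]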

Recalling that $x_1 = I$ is the current in the given RLC circuit, we have
shown that

\[
I(t) = c_1 e^{-2t}(-2\cos t + \sin t) + c_2 e^{-2t}(-\cos t - 2\sin t) + \tfrac{1}{13}\cos 2t + \tfrac{8}{13}\sin 2t
\]


Given initial conditions for I(0) and I'(0), we can find the values of the constants
$c_1$ and $c_2$. Moreover, we note that as $t \to \infty$, the components of the solution that
include $e^{-2t}$ will die off, leaving us with long-term behavior of I(t) modeled by
$\tfrac{1}{13}\cos 2t + \tfrac{8}{13}\sin 2t$. We hence call $\tfrac{1}{13}\cos 2t + \tfrac{8}{13}\sin 2t$ the steady-state solution
of the original equation (3.8.25) and $c_1 e^{-2t}(-2\cos t + \sin t) + c_2 e^{-2t}(-\cos t - 2\sin t)$
the transient solution.
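To see the split into transient and steady-state parts numerically, one can evaluate each piece at a few times. The sketch below makes two assumptions that are ours: the constants c1 = 3 and c2 = -1 are arbitrary illustrative values, and the residual is taken from the second row of the system as it appears in the substitution step, namely x2' = -5 x1 - 4 x2 + 5 cos 2t.

```python
import math


def steady_state(t):
    """Return (I, I', I'') for I(t) = (1/13) cos 2t + (8/13) sin 2t."""
    c, s = math.cos(2 * t), math.sin(2 * t)
    i = (c + 8 * s) / 13
    di = (16 * c - 2 * s) / 13
    d2i = (-4 * c - 32 * s) / 13
    return i, di, d2i


def residual(t):
    """Second row of x_p' - (A x_p + b); zero when x_p solves the system."""
    i, di, d2i = steady_state(t)
    return d2i - (-5 * i - 4 * di + 5 * math.cos(2 * t))


def transient(t, c1, c2):
    """Transient part of I(t) for the given constants c1 and c2."""
    return math.exp(-2 * t) * (
        c1 * (-2 * math.cos(t) + math.sin(t))
        + c2 * (-math.cos(t) - 2 * math.sin(t))
    )


# c1 = 3 and c2 = -1 are arbitrary illustrative values, not taken from the text.
for t in (0.0, 1.0, 5.0, 10.0):
    print(f"t={t:5.1f}  transient={transient(t, 3, -1):+.3e}  residual={residual(t):+.1e}")
```

The residual stays at rounding-error level for every t, while the transient shrinks like $e^{-2t}$ regardless of which constants are chosen.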

Overall, we have now seen several examples of important phenomena 
governed by linear systems of differential equations. Further examples will be 
considered in the exercises. 


Exercises 3.8 


1. In a closed system of two tanks (i.e., one for which there are no input flows
and no output flows), the following information is given. Tank A is filled
with 100 liters of solution whose initial concentration is 0.25 g/liter. Tank
B is filled with 50 liters of solution whose initial concentration is 1 g/liter.
The two tanks are connected with two pipes having flows in opposite
directions; mixed solution from Tank A flows to Tank B at a rate of
4 liters/min. Similarly, mixed solution flows from Tank B to Tank A at a
rate of 4 liters/min.


Set up and solve an initial-value problem whose solution will tell you 
the amount of salt in each tank at time t. Discuss the graphical behavior 
of the solution x(t) (whose components are the amount of salt in each 
tank at time t). Is there an equilibrium solution to the system? If so,
what is it? 


2. Consider a system of two tanks connected in such a way that each of the 
tanks has an independent inflow that delivers salt solution to it, each has 
an independent outflow (drain), and each tank is connected to the other 
with an outflow and an inflow. The relevant information about each tank 
is given in the table below. 


                                  Tank A              Tank B
Tank volume                       100 liters          200 liters
Rate of inflow to the tank        5 liters/min        9 liters/min
Concentration of salt in inflow   7 g/liter           3 g/liter
Rate of drain outflow             4 liters/min        10 liters/min
Rates of outflows to other tank   to B: 3 liters/min  to A: 2 liters/min


Initially, Tank A has 20 g of salt present in its solution, and Tank B has 
75 g of salt present in its solution. 


Set up and solve an initial-value problem whose solution will determine 
the amount of salt in each tank at time t. Discuss the graphical behavior 
of the solution x(t) (whose components are the amount of salt in each
tank at time t). Is there an equilibrium solution to the system? If so,
what is it? 


3. Suppose that in exercise 2 all of the given information remains the 
same except for the fact that instead of saltwater flowing into each 
tank, pure water flows in. How do the results of your work in exercise 2 
change? 


4. In a closed system of three tanks (that is, one for which there are no input 
flows and no output flows), the following information is given. 


                    Tank A              Tank B              Tank C
Tank volume         100 liters          150 liters          125 liters
Rates of outflows   to B: 3 liters/min  to C: 1 liter/min   to A: 4 liters/min
to other tanks      to C: 4 liters/min  to A: 3 liters/min  to B: 1 liter/min


Tank A is filled with 100 liters of solution whose initial concentration is 
8 g/liter. Tank B is filled with 150 liters of solution whose initial 
concentration is 3 g/liter. Tank C is initially filled with 125 liters of pure 
water. The three tanks are connected with pipes having flows in opposite 
directions; flow rates are given in the table above. 


Set up and solve an initial-value problem whose solution will tell you 
the amount of salt in each tank at time t. Discuss the graphical behavior 
of the solution x(t) (whose components are the amount of salt in each 
tank at time t). Is there an equilibrium solution to the system? If so,
what is it?


5. In a system of three tanks of saltwater, the following information is given.


                                  Tank A              Tank B              Tank C
Tank volume                       400 liters          200 liters          300 liters
Rate of inflow to the tank        7 liters/min        0 liters/min        0 liters/min
Concentration of salt in inflow   10 g/liter          n/a                 n/a
Rate of drain outflow             0 liters/min        0 liters/min        7 liters/min
Rates of outflows                 to B: 7 liters/min  to C: 7 liters/min  to A: 0 liters/min
to other tanks                    to C: 0 liters/min  to A: 0 liters/min  to B: 0 liters/min


Each tank is full; Tank A contains solution whose initial concentration is
20 g/liter. Tank B contains solution whose initial concentration is
50 g/liter. Tank C contains pure water.


Without setting up a system of differential equations, first use your 
intuition to describe what you think will be the behavior of the functions 
$x_1(t)$, $x_2(t)$, and $x_3(t)$ that measure the amount of salt in each of the three
respective tanks at time t.


Then, set up and solve an initial-value problem whose solution will tell 
you the amount of salt in each tank at time t. Discuss the graphical 
behavior of each component of the solution x(t) and compare it to your 
intuitive expectations. Is there an equilibrium solution to the system? If so, 
what is it? 


6. In a system of three tanks of saltwater interconnected with pipes of inflow
and outflow to and from each, the following information is given.


                                  Tank A              Tank B              Tank C
Tank volume                       400 liters          800 liters          500 liters
Rate of inflow to the tank        5 liters/min        10 liters/min       5 liters/min
Concentration of salt in inflow   25 g/liter          15 g/liter          40 g/liter
Rate of drain outflow             4 liters/min        7 liters/min        9 liters/min
Rates of outflows                 to B: 6 liters/min  to C: 5 liters/min  to A: 4 liters/min
to other tanks                    to C: 4 liters/min  to A: 5 liters/min  to B: 1 liter/min


Assume that the system is such that initially there is a concentration
of 10 g/liter of salt in each of the three tanks. Set up and solve an
initial-value problem whose solution will tell you the amount of salt in
each tank at time t. Discuss the graphical behavior of each component of
the solution x(t). Is there an equilibrium solution to the system? If so,
what is it?


7. Recall that for a spring-mass system of mass m, spring constant k, and
damping constant c, the displacement y(t) of the mass from equilibrium
is governed by the linear second-order differential equation

\[
y'' + \frac{c}{m}\,y' + \frac{k}{m}\,y = \frac{1}{m}\,F(t)
\]

For a mass of 0.5 kg with spring constant k = 2 N/m in an undamped,
unforced system, assume the mass is displaced 0.4 m from equilibrium
and released (i.e., y(0) = 0.4 and y'(0) = 0).

(a) State the second-order IVP that models this situation.

(b) Convert the second-order equation to a system of first-order DEs
using the standard substitution: $x_1 = y$, $x_2 = y'$.

(c) Solve the system in (b), and graph the component function $x_1(t)$.
Discuss the long-term behavior of the spring-mass system.


8. For a mass of 0.5 kg with spring constant k = 2 N/m and damping
constant c = 0.5 N·s/m in an unforced system, assume the mass is
displaced 0.3 m from equilibrium and released.

(a) State the second-order IVP that models this situation.

(b) Convert the second-order equation to a system of first-order DEs
using the standard substitution: $x_1 = y$, $x_2 = y'$.

(c) Solve the system in (b), and graph the component function $x_1(t)$.
Discuss the long-term behavior of the spring-mass system.


9. For a mass of 0.5 kg with spring constant k = 2 N/m and damping constant
c = 0.5 N·s/m in a forced system with forcing function F(t) = cos 2t N,
assume the mass is initially displaced 0.3 m from equilibrium and released.

(a) State the second-order IVP that models this situation.

(b) Convert the second-order equation to a system of first-order DEs
using the standard substitution: $x_1 = y$, $x_2 = y'$.

(c) Use variation of parameters to solve the system in (b), and graph the
component function $x_1(t)$. Discuss the long-term behavior of the
spring-mass system.


10. In section 3.8.2, we considered a system of two masses attached to two
springs in parallel, where a mass $m_1$ is attached to a spring with spring
constant $k_1$ and from $m_1$ a second spring with constant $k_2$ and mass $m_2$ is
attached. See figure 3.16.

If we assume that the surface on which the masses rest is frictionless,
let $x_1(t)$ denote the displacement of $m_1$ from its equilibrium position
and $x_2(t)$ the displacement of $m_2$ from its equilibrium position, and set the
system in motion, then the system is governed by the system of second-order
differential equations

\[
\begin{aligned}
x_1'' &= -\frac{k_1}{m_1}\,x_1 + \frac{k_2}{m_1}\,(x_2 - x_1) \\
x_2'' &= -\frac{k_2}{m_2}\,(x_2 - x_1)
\end{aligned}
\]

(a) Suppose that $k_1 = 2$, $m_1 = 1$, $k_2 = 4$, and $m_2 = 0.5$. Using the given
constant values and the substitution $y_1 = x_1$, $y_2 = x_1'$, $y_3 = x_2$,
$y_4 = x_2'$, convert the system of two second-order equations to a
system of four first-order equations.

(b) Assume that the masses $m_1$ and $m_2$ are each displaced 1 unit from their
natural equilibrium and released. That is, assume $x_1(0) = 1$, $x_1'(0) = 0$,
$x_2(0) = 1$, and $x_2'(0) = 0$. Solve this initial-value problem using the
system in (a), sketch the plots of $y_1$ and $y_3$, and discuss what they
tell you about the system.


11. Recall that the current I(t) in an RLC circuit is governed by the linear
second-order differential equation

\[
L\,I''(t) + R\,I'(t) + \frac{1}{C}\,I(t) = E(t)
\]

where L is the inductance, R the resistance, and C the capacitance of the
circuit.


Suppose we have an RLC circuit for which an inductor of L = 1 henry and
capacitor C = 0.01 farad are present. Assume further that I(0) = 100 and
I'(0) = 0.


(a) State a second-order IVP whose solution is I(t), the current at time t.

(b) Convert the IVP in (a) to a system of first-order IVPs using a standard
substitution.

(c) Solve the system in (b) to determine the current I(t) in the cases where
the resistance is (i) R = 0 Ω, (ii) R = 16 Ω, (iii) R = 20 Ω, and
(iv) R = 25 Ω, assuming consistent units. Sketch a plot of each solution
I(t) and discuss the impact that changing R has on the current.


12. Suppose we have an RLC circuit for which an inductor of L = 1 H, resistor
R = 16 Ω, and capacitor C = 0.01 F are present. Assume further that
I(0) = 100 A and I'(0) = 0. Finally, suppose that the system is provided a
voltage source of E(t) = 100 sin 10t.


(a) State a second-order IVP whose solution is I(t), the current in the
circuit at time t.

(b) Convert the IVP in (a) to a system of first-order IVPs using a standard 
substitution. 

(c) Solve the system in (b) to determine the current I(t) at time t. Sketch a 
plot of the solution I(t) and discuss the impact the forcing function 
has on the current. 


3.9 For further study 
3.9.1 Diagonalizable matrices and coupled systems 


We have seen that in the case where a system of linear first-order differential 
equations is uncoupled, such as 


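\[
\begin{bmatrix} x_1' \\ x_2' \end{bmatrix}
= \begin{bmatrix} 3 & 0 \\ 0 & -2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
= \begin{bmatrix} 3x_1 \\ -2x_2 \end{bmatrix}
\]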


the system is particularly straightforward to solve. In addition, even when the
coefficient matrix A of the system x’ = Ax is not a diagonal matrix, in the
case where A is n x n and has n real, linearly independent eigenvectors, it is
again a straightforward exercise to determine the general solution to x’ = Ax.
In what follows, we investigate the connections between A having n real linearly
independent eigenvectors and the system being uncoupled.


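(a) Solve the uncoupled system of linear first-order equations

\[
\begin{bmatrix} x_1' \\ x_2' \end{bmatrix}
= \begin{bmatrix} 3 & 0 \\ 0 & -2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
= \begin{bmatrix} 3x_1 \\ -2x_2 \end{bmatrix}
\]

by directly solving the two individual equations $x_1' = 3x_1$ and $x_2' = -2x_2$.


(b) For the coefficient matrix

\[
A = \begin{bmatrix} 3 & 0 \\ 0 & -2 \end{bmatrix}
\]

how are your solutions in (a) to the individual differential equations
related to the eigenvalues and eigenvectors of A?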


(c) Determine the eigenvalues and eigenvectors of the matrix A= : i and 
show that A has two real, linearly independent eigenvectors. 


(d) Let D be the diagonal 2 x 2 matrix whose diagonal entries are $\lambda_1$ and $\lambda_2$,
the eigenvalues of A from (c), and let P be the 2 x 2 matrix whose columns
are $\mathbf{x}_1$ and $\mathbf{x}_2$, the eigenvectors of A corresponding to $\lambda_1$ and $\lambda_2$. Show that
AP = PD.


(e) More generally, let A be an n x n matrix with n linearly independent real
eigenvectors $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ that correspond to real eigenvalues
$\lambda_1, \lambda_2, \ldots, \lambda_n$. As in (d), let D be the diagonal matrix whose diagonal
entries are the eigenvalues of A and P be the matrix whose columns are the
corresponding eigenvectors of A. Explain why AP = PD and thus why
$A = PDP^{-1}$ and $D = P^{-1}AP$.


A real n x n matrix A with the property that it has n real, linearly
independent eigenvectors is called diagonalizable. When we factor A in the
form $A = PDP^{-1}$, we say that we have diagonalized the matrix A.


(f) For a 2 x 2 diagonalizable matrix A, consider the system of differential
equations given by x’ = Ax. Let D and P be the matrices defined above in
(d). Note that in this problem A is an arbitrary diagonalizable matrix: we
are not specifying the values of $\lambda_1$ and $\lambda_2$, nor the values of the entries in
the corresponding eigenvectors.


(i) Let $\mathbf{y} = P^{-1}\mathbf{x}$. Show that $\mathbf{x}' = P\mathbf{y}'$.

(ii) Use the substitution $\mathbf{y} = P^{-1}\mathbf{x}$ and the fact that $A = PDP^{-1}$ to show
that the original system x’ = Ax may be equivalently represented by
the system y’ = Dy.

(iii) Explain why the system y’ = Dy is preferable to the system x’ = Ax.

(g) For the matrix A= E , solve the system x’ = Ax by executing the
following steps.


(i) Diagonalize A by determining matrices D and P such that $A = PDP^{-1}$.
Recall that D is the diagonal matrix whose diagonal entries are the
eigenvalues of A and P is the matrix whose columns are the
corresponding eigenvectors of A.

(ii) Follow your work in (f) to introduce a substitution that converts the 
system x’ = Ax to a new system in the variable y that is uncoupled and 
of the form y’ = Dy. 

(iii) Solve the uncoupled system in (ii) for y. 

(iv) Determine the solution x to the original system by showing that 

x = Py and using this substitution appropriately. 


(h) Solve the system x’ = Ax given by 


using the approach outlined in (g). 
(i) Solve the system x’ = Ax given by 


\[
A = \begin{bmatrix} 3 & -1 & 1 \\ -1 & 3 & -1 \\ 1 & -1 & 3 \end{bmatrix}
\]


using the approach outlined in (g). 


(j) Compare your work in (g) through (i) to how you learned to solve the system
x’ = Ax in section 3.3. Is this new approach fundamentally the same or is 
it markedly different? Explain. 


3.9.2 Matrix exponential 


An important result in calculus is that $e^x$ can be represented by its Taylor series
expansion

\[
e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!} + \cdots \tag{3.9.1}
\]

and that (3.9.1) holds for every real value of x. In what follows, we explore the
notion of $e^A$, where A is a matrix, through the use of an analogous expansion,
as well as the role of $e^A$ in the solution of systems of differential equations of the
form x’ = Ax.


(a) Let A be the diagonal matrix
Explain why


(b) For the matrix A in (a), show that 


32 3n 
Aa A bob bata TE te 0 
2! n! 


2 n 
0 a RE 
(3.9.2) 


Based on the entries in the right-hand matrix of (3.9.2), explain why it is
reasonable to write that

\[
e^A = I + A + \frac{1}{2!}A^2 + \frac{1}{3!}A^3 + \cdots + \frac{1}{n!}A^n + \cdots \tag{3.9.3}
\]

We use (3.9.3) as the definition of $e^A$ for any diagonal matrix A.


(c) Now consider the matrix B = E =| Find the eigenvalues and 
eigenvectors of B and diagonalize B by writing 
\[
B = PDP^{-1}
\]

where D is the diagonal matrix whose diagonal entries are the eigenvalues
of B and P is the matrix whose columns are the corresponding
eigenvectors of B. For more on the notion of a matrix being
"diagonalizable", see subsection 3.9.1.


(d) For an arbitrary diagonalizable matrix B for which $B = PDP^{-1}$ (where D
and P have the meaning ascribed in (c)), show that

\[
B^n = PD^nP^{-1}
\]


(e) For an arbitrary diagonalizable matrix B, explain why

\[
I + B + \frac{1}{2!}B^2 + \frac{1}{3!}B^3 + \cdots + \frac{1}{n!}B^n + \cdots
= P\left(I + D + \frac{1}{2!}D^2 + \frac{1}{3!}D^3 + \cdots + \frac{1}{n!}D^n + \cdots\right)P^{-1}
\]

again where D and P have the meaning ascribed in (c). We thus define $e^B$
for any diagonalizable matrix B by the equation

\[
e^B = I + B + \frac{1}{2!}B^2 + \frac{1}{3!}B^3 + \cdots + \frac{1}{n!}B^n + \cdots \tag{3.9.4}
\]


(f) Show that if B is any diagonalizable matrix such that $B = PDP^{-1}$ (where D
and P have the meaning ascribed in (c)), then

\[
e^B = Pe^DP^{-1}
\]

(g) Use the result in (f) to compute $e^B$ for the specific matrix B given in (c).


(h) Recall that when we solve a single homogeneous linear first-order DE
such as

\[
y' = 5y
\]

one way to solve the equation is to guess that the solution is $y = e^{rt}$ and
work to determine the value of r that satisfies the DE. Of course we find
that r = 5 and $y = Ce^{5t}$ is the general solution. Indeed, for any constant
a, the solution to y’ = ay is $y = Ce^{at}$.


Now consider solving the system of differential equations


) a. <0 
x =Ax=|5 4 (3.9.5) 


noting that A is the diagonal matrix from (a) above. 


(i) Viewing t as a scalar multiplier of A, update your work from (3.9.3) to
write a series expansion for $e^{At}$.
(ii) Noting that $e^{At}$ is a matrix, explain why it is reasonable to guess that
$\Psi(t) = e^{At}$ is a solution matrix for the system x’ = Ax.
(iii) Using your expression from (i) for $\Psi(t) = e^{At}$, compute both $\Psi'(t)$
and $A\Psi(t)$ to verify that the matrix function $\Psi(t)$ satisfies the
equation $\Psi'(t) = A\Psi(t)$.
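As a numerical companion to the series definition, partial sums of $I + At + \frac{1}{2!}(At)^2 + \cdots$ can be computed directly and compared with the entrywise exponentials expected for a diagonal matrix. The sketch below is illustrative only: the matrix diag(3, -2), the time t = 0.5, and the helper names `mat_mul` and `exp_series` are our own choices, not part of the exercise.

```python
import math


def mat_mul(x, y):
    """Product of two square matrices stored as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def exp_series(a, t, terms=30):
    """Partial sum of I + (At) + (At)^2/2! + ... using `terms` terms."""
    n = len(a)
    at = [[t * v for v in row] for row in a]
    # term holds (At)^k / k!, starting with k = 0 (the identity).
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, at)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total


# Hypothetical diagonal matrix and time value, chosen only for illustration.
A = [[3.0, 0.0], [0.0, -2.0]]
t = 0.5
approx = exp_series(A, t)
expected = [[math.exp(3 * t), 0.0], [0.0, math.exp(-2 * t)]]
for row_a, row_e in zip(approx, expected):
    print(row_a, row_e)
```

For a diagonal matrix the off-diagonal entries of every power remain zero, which is why the partial sums reduce to two scalar Taylor series.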


4 Higher order differential equations


4.1 Motivating equations 


Through our study of linear systems of differential equations, we have already 
encountered higher order differential equations that arise naturally in physical 
applications. Two particularly important ones are those associated with
spring-mass systems and RLC circuits. Here, we briefly revisit these equations.

In section 3.1, we considered a mass m suspended from a spring with spring 
constant k that is subject to damping with proportionality constant c. If F(t) is 
an external forcing function on the system, then the displacement y(t) of the 
mass from equilibrium satisfies 


my” + cy’ + ky = F(t) (4.1.1) 


This is a nonhomogeneous linear second-order differential equation. While we
have already studied this equation by using the substitution $x_1 = y$ and $x_2 = y'$
and considered the resulting linear system of first-order differential equations,
there is further insight to be gained by examining (4.1.1) solely as a
second-order equation. In fact, while it is theoretically possible to solve (4.1.1) using
the corresponding linear system and ideas from chapter 3, doing so in the cases
where $F(t) \neq 0$ is often cumbersome; we will see in section 4.4 that this equation
may often be solved in a straightforward manner by leaving it in its original form
as a second-order equation.

In section 3.8, we encountered another important nonhomogeneous linear 
second-order differential equation. By viewing the flow of electricity through 
a circuit as analogous to the flow of water in a pipe, we came to understand a 
differential equation that models the current I(t). Using results from physics,
including Ohm’s law, Faraday’s law, and Coulomb’s law, we learned that the
current I(t) must satisfy the linear second-order differential equation

\[
LI'' + RI' + \frac{1}{C}\,I = E(t) \tag{4.1.2}
\]


where L is the inductance, R is the resistance, C is the capacitance, and E(t) 
represents an external voltage source. 

We note specifically that the governing differential equations for spring-mass
systems and RLC circuits are both linear nonhomogeneous second-order
differential equations with constant coefficients. These differential equations 
therefore merit further study as we endeavor to more fully understand these 
physical systems. When the damping constant c = 0 and the resistance R = 0 
in (4.1.1) and (4.1.2), these equations are often called harmonic oscillator 
equations. When small damping or resistance is present, we refer to them as 
damped harmonic oscillators. 


4.2 Homogeneous equations: distinct real roots 


If we consider our experience with single homogeneous linear first-order 
differential equations and systems thereof, we realize that the exponential 
function plays a central role in their solution. For example, if we solve the 
equation 


\[
y' - 5y = 0
\]

the solution is $y = ce^{5t}$. Likewise, if we solve the system given by x’ = Ax, where
A is a matrix with eigenvalues $\lambda_1 = 2$ and $\lambda_2 = -3$, then the general solution is

\[
\mathbf{x} = c_1 e^{2t}\mathbf{v}_1 + c_2 e^{-3t}\mathbf{v}_2
\]

where $\mathbf{v}_1$ and $\mathbf{v}_2$ are eigenvectors that correspond to the eigenvalues $\lambda_1 = 2$ and
$\lambda_2 = -3$.

Given this prominence of the exponential function, it is not surprising
that functions of the form $y = e^{rt}$ play a central role in our study of higher
order equations. For example, consider the second-order linear homogeneous
differential equation with constant coefficients given by

\[
y'' - y' - 6y = 0 \tag{4.2.1}
\]


Even without our experience with first-order equations and systems, it is
reasonable to think that one or more functions of the form $y = e^{rt}$ will be
a solution to this equation because of the question the equation poses: "what
function y is such that its second derivative minus its first derivative is equal to
6 times itself?" In essence, we are looking for a function y such that a certain
linear combination of the function, its first derivative, and its second derivative
is the zero function. This makes it natural for us to expect that the solution is
such that its derivatives are scalar multiples of itself, hence leading us to consider
$y = e^{rt}$.

Letting $y = e^{rt}$, we observe that $y' = re^{rt}$ and $y'' = r^2e^{rt}$. Substituting these
functions into (4.2.1) requires r to satisfy the equation

\[
r^2e^{rt} - re^{rt} - 6e^{rt} = 0 \tag{4.2.2}
\]

Factoring, we can rewrite (4.2.2) as

\[
e^{rt}(r^2 - r - 6) = 0
\]

and since $e^{rt}$ is never zero, it follows that r must be such that $r^2 - r - 6 =
(r - 3)(r + 2) = 0$. From this, r = 3 or r = -2, and therefore $y_1 = e^{3t}$ and
$y_2 = e^{-2t}$ are both solutions to (4.2.1).

Since $y_1 = e^{3t}$ is not a scalar multiple of $y_2 = e^{-2t}$, it follows that $y_1$ and $y_2$ are
linearly independent solutions to (4.2.1). Through our work with homogeneous
linear systems, we are accustomed to taking linear combinations of linearly
independent solutions in order to form a general solution; the same principle
holds here, which we will verify directly. Letting $y = c_1y_1 + c_2y_2 = c_1e^{3t} + c_2e^{-2t}$,
it follows that $y' = 3c_1e^{3t} - 2c_2e^{-2t}$ and $y'' = 9c_1e^{3t} + 4c_2e^{-2t}$. If we now
consider $y'' - y' - 6y$, we have

\[
\begin{aligned}
y'' - y' - 6y &= (9c_1e^{3t} + 4c_2e^{-2t}) - (3c_1e^{3t} - 2c_2e^{-2t}) - 6(c_1e^{3t} + c_2e^{-2t}) \\
&= (9c_1e^{3t} - 3c_1e^{3t} - 6c_1e^{3t}) + (4c_2e^{-2t} + 2c_2e^{-2t} - 6c_2e^{-2t}) \\
&= 0
\end{aligned}
\]


Thus, we have shown that every function of the form $y = c_1e^{3t} + c_2e^{-2t}$ is a
solution to (4.2.1). This shows that the solution space of (4.2.1) is at least
two-dimensional; might there be any other linearly independent solutions to the
equation? By our earlier work with systems, we know that the solution space of
the equation x’ = Ax, where A is n x n, is n-dimensional. Since the second-order
equation (4.2.1) can be converted to a 2 x 2 system of equations, it follows that
its solution space has dimension exactly 2, and thus

\[
y = c_1e^{3t} + c_2e^{-2t} \tag{4.2.3}
\]

is the general solution to (4.2.1).

Our work to show that if $y_1$ and $y_2$ are solutions to (4.2.1), then $y = c_1y_1 +
c_2y_2$ is also a solution may be generalized to any homogeneous linear
second-order differential equation. We state this result in the following theorem.


Theorem 4.2.1 If $y_1$ and $y_2$ are solutions to the second-order linear
homogeneous equation

\[
y'' + a(t)y' + b(t)y = 0
\]

then $y = c_1y_1 + c_2y_2$ is also a solution for any constants $c_1$ and $c_2$.

The important roles the constants $c_1$ and $c_2$ play are further exemplified by
initial-value problems. For example, if we consider the initial-value problem 


\[
y'' - y' - 6y = 0, \quad y(0) = 2, \quad y'(0) = 1 \tag{4.2.4}
\]

we can show that this IVP has a unique solution. Using the general solution
$y(t) = c_1e^{3t} + c_2e^{-2t}$, the condition y(0) = 2 implies that

\[
2 = c_1 + c_2 \tag{4.2.5}
\]

Differentiating the general solution, we find that $y'(t) = 3c_1e^{3t} - 2c_2e^{-2t}$, and
therefore y'(0) = 1 implies

\[
1 = 3c_1 - 2c_2 \tag{4.2.6}
\]

Equations (4.2.5) and (4.2.6) form a linear system of two equations in two
unknowns. Solving this system, $c_1 = 1$ and $c_2 = 1$, so that the function

\[
y(t) = e^{3t} + e^{-2t}
\]

is the unique solution to (4.2.4).
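The two-by-two system for the constants has the same shape for any pair of distinct exponents, so it can be solved once in general. The sketch below uses Cramer's rule with exact fractions; the function name `constants_from_initial_data` is our own, and the approach assumes the two exponents are distinct so that the determinant is nonzero.

```python
from fractions import Fraction


def constants_from_initial_data(r1, r2, y0, yp0):
    """Solve c1 + c2 = y0 and r1*c1 + r2*c2 = yp0 by Cramer's rule.

    Here y = c1*exp(r1*t) + c2*exp(r2*t), y(0) = y0, and y'(0) = yp0.
    Requires r1 != r2 so that the determinant r2 - r1 is nonzero.
    """
    det = Fraction(r2 - r1)
    c1 = Fraction(y0 * r2 - yp0) / det
    c2 = Fraction(yp0 - r1 * y0) / det
    return c1, c2


# The IVP (4.2.4): exponents 3 and -2, with y(0) = 2 and y'(0) = 1.
c1, c2 = constants_from_initial_data(3, -2, 2, 1)
print(c1, c2)  # 1 1
```

Because the determinant is $r_2 - r_1$, the constants are determined uniquely exactly when the two exponents differ, which matches the uniqueness observed above.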

Our work with the example equation $y'' - y' - 6y = 0$ is indicative of
many broader trends in the study of second-order linear differential equations. 
Because such equations can be converted to systems, we should not be at all 
surprised to learn that a broad class of initial-value problems associated with 
second-order equations have unique solutions, nor that the general solution to 
a second-order equation belongs to a two-dimensional solution space. We state 
two theorems in order to formalize these observations. 


Theorem 4.2.2 Consider the second-order initial-value problem given by 


\[
y'' + p(t)y' + q(t)y = f(t), \quad y(t_0) = y_0, \quad y'(t_0) = y_1 \tag{4.2.7}
\]

where the coefficient functions p(t) and q(t) and the forcing function f(t) are
continuous on an open interval (a, b). Given any $t_0$ in (a, b), (4.2.7) has a unique
solution in (a, b).


While the proof of theorem 4.2.2 is beyond the scope of this book, it is 
notable that in the case that p(t) and q(t) are constant functions, we can prove 
the theorem. Indeed, we will do so by actually constructing the solution in 
various cases in this section and those following. 

Just as we almost exclusively considered matrices A with constant entries 
in our work with systems of linear first-order differential equations of the form 
x’ = Ax, in our study of second-order linear differential equations, we will 
normally consider the situation where the coefficient functions p(t) and q(t) 
are constant. For this context, we can deduce the following result. 


Theorem 4.2.3 The set of all solutions to the second-order homogeneous
linear differential equation $y'' + a_1y' + a_0y = 0$, where $a_0$ and $a_1$ are constants,
is a vector space of dimension 2.

This result can be viewed as a consequence of theorem 3.3.2 for linear
systems of differential equations with constant coefficients. In particular, given

\[
y'' + a_1y' + a_0y = 0 \tag{4.2.8}
\]

if we use the standard substitution $x_1 = y$, $x_2 = y'$, then it follows that (4.2.8) is
equivalent to the system

\[
\mathbf{x}' = A\mathbf{x} = \begin{bmatrix} 0 & 1 \\ -a_0 & -a_1 \end{bmatrix}\mathbf{x}
\]

which has a two-dimensional solution space.
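The link between the second-order equation and its 2 x 2 system can be made concrete: the eigenvalues of the coefficient matrix above are the roots of $\lambda^2 + a_1\lambda + a_0 = 0$. The following sketch checks this for the earlier example $y'' - y' - 6y = 0$ (so $a_1 = -1$ and $a_0 = -6$); the helper names are our own, and the eigenvalue routine handles only the case of real eigenvalues.

```python
import math


def companion(a1, a0):
    """Coefficient matrix of the system obtained from y'' + a1*y' + a0*y = 0."""
    return [[0.0, 1.0], [-a0, -a1]]


def real_eigenvalues_2x2(m):
    """Eigenvalues of a 2 x 2 matrix via trace and determinant (real case only)."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = tr * tr - 4 * det
    if disc < 0:
        raise ValueError("eigenvalues are not real")
    root = math.sqrt(disc)
    return (tr + root) / 2, (tr - root) / 2


A = companion(-1, -6)
eigs = real_eigenvalues_2x2(A)
print(eigs)  # (3.0, -2.0)

# For each eigenvalue r, the vector [1, r] should satisfy A [1, r] = r [1, r].
for r in eigs:
    image = [A[0][0] + A[0][1] * r, A[1][0] + A[1][1] * r]
    print(r, image, [r, r * r])
```

The eigenvector [1, r] mirrors the substitution itself: its first entry corresponds to y and its second to y', which for an exponential solution is r times y.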

Thus, in order to solve (4.2.8), we seek two linearly independent solutions
that satisfy the equation. In particular, if we can find two functions $y_1 = e^{r_1t}$
and $y_2 = e^{r_2t}$ that are both solutions to (4.2.8), where $r_1 \neq r_2$, then the general
solution must be

\[
y = c_1e^{r_1t} + c_2e^{r_2t}
\]

More specifically, if we recall our earlier approach following (4.2.1) in the first
example in this section, we made the assumption that a solution y has form
$y = e^{rt}$. Doing so and substituting in the general equation $y'' + a_1y' + a_0y = 0$,
we see that r must satisfy

\[
r^2e^{rt} + a_1re^{rt} + a_0e^{rt} = 0 \tag{4.2.9}
\]

Since $e^{rt}$ is never zero, it follows that r must be a solution of the characteristic
equation of the second-order homogeneous linear equation (4.2.8), which is

\[
r^2 + a_1r + a_0 = 0 \tag{4.2.10}
\]

If $r_1$ and $r_2$ are the roots of (4.2.10), then it follows that $y_1 = e^{r_1t}$ and $y_2 = e^{r_2t}$
are both solutions to the original equation (4.2.8). In particular, if $r_1 \neq r_2$, then
$y_1$ and $y_2$ are linearly independent and we have found the general solution
to (4.2.8), which is

\[
y = c_1e^{r_1t} + c_2e^{r_2t}
\]


We state this result formally in the following theorem. 


Theorem 4.2.4 Given the second-order linear differential equation with
constant coefficients

\[
y'' + a_1y' + a_0y = 0
\]

if the characteristic equation $r^2 + a_1r + a_0 = 0$ has two distinct real roots $r_1$
and $r_2$, then the general solution to the differential equation is

\[
y = c_1e^{r_1t} + c_2e^{r_2t}
\]
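The theorem translates into a short routine: compute the roots of the characteristic equation, confirm they are real and distinct, and build the general solution from them. The sketch below is one possible arrangement, with function names of our own choosing; it checks the result on $y'' - y' - 6y = 0$ by evaluating $y'' - y' - 6y$ at an arbitrary time with arbitrary constants.

```python
import math


def distinct_real_roots(a1, a0):
    """Roots of r^2 + a1*r + a0 = 0, required to be real and distinct."""
    disc = a1 * a1 - 4 * a0
    if disc <= 0:
        raise ValueError("characteristic equation lacks two distinct real roots")
    s = math.sqrt(disc)
    return (-a1 + s) / 2, (-a1 - s) / 2


def general_solution(a1, a0):
    """Return y(t, c1, c2, order) for y = c1*e^(r1 t) + c2*e^(r2 t)."""
    r1, r2 = distinct_real_roots(a1, a0)

    def y(t, c1, c2, order=0):
        # Each derivative of e^(r t) multiplies by another factor of r.
        return c1 * r1**order * math.exp(r1 * t) + c2 * r2**order * math.exp(r2 * t)

    return y


print(distinct_real_roots(-1, -6))  # (3.0, -2.0)

y = general_solution(-1, -6)
# t = 0.7, c1 = 2, c2 = -5 are arbitrary test values.
residual = y(0.7, 2.0, -5.0, order=2) - y(0.7, 2.0, -5.0, order=1) - 6 * y(0.7, 2.0, -5.0)
print(abs(residual) < 1e-9)  # True
```

The routine deliberately raises an error when the discriminant is zero or negative, since those cases fall outside the hypothesis of the theorem.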


We close this section with an example.


Figure 4.1 A plot of the solution y(t) to the IVP given in (4.2.11).


Example 4.2.1 Solve the second-order initial-value problem given by

\[
y'' + 7y' + 12y = 0, \quad y(0) = 3, \quad y'(0) = -1 \tag{4.2.11}
\]

Graph the solution and discuss its long-term behavior.

Solution. We begin by assuming that $y = e^{rt}$. Direct substitution into (4.2.11)
and removing the factor $e^{rt}$ results in the characteristic equation

\[
r^2 + 7r + 12 = 0
\]

Factoring, we find that (r + 3)(r + 4) = 0, and therefore, r = -3 or r = -4.
Since the two r values are distinct, it follows that $y_1 = e^{-3t}$ and $y_2 = e^{-4t}$ are
linearly independent solutions to (4.2.11) and the general solution is

\[
y = c_1e^{-3t} + c_2e^{-4t} \tag{4.2.12}
\]

Applying the given initial conditions, we can solve for $c_1$ and $c_2$. Since y(0) = 3
and y'(0) = -1, (4.2.12) implies that

\[
\begin{aligned}
3 &= c_1 + c_2 \\
-1 &= -3c_1 - 4c_2
\end{aligned}
\]

It follows that $c_1 = 11$ and $c_2 = -8$, and thus the unique solution to the given
IVP (4.2.11) is $y = 11e^{-3t} - 8e^{-4t}$. Plotting y(t) results in the graph shown in
figure 4.1, where we clearly see the given initial behavior at t = 0 (the function
value is 3 and the slope of the tangent line is -1) and that the solution’s long-term
behavior is that $y(t) \to 0$ as $t \to \infty$.

We can also observe from the negative constants present in the exponents of
the general solution $y = c_1e^{-3t} + c_2e^{-4t}$ that every such solution must tend to
zero as $t \to \infty$. We note that y = 0 is the only constant (equilibrium) solution
to the original equation $y'' + 7y' + 12y = 0$, and that because every solution
tends to y = 0, we say y = 0 is a stable equilibrium.


Exercises 4.2


In exercises 1-7, determine the general solution to the given
second-order homogeneous linear DE.


1. $y'' - y' - 12y = 0$
2. $y'' + y' - 2y = 0$
3. $y'' - y = 0$
4. $y'' + 3y' = 0$
5. $y'' = 0$
6. $y'' + 4y' + 3y = 0$
7. $y'' + y' - y = 0$


In exercises 8-14, solve the stated IVP. In addition, graph your solution and
discuss its long-term behavior. Note that the general solution to each equation
has been found in exercises 1-7.


8. $y'' - y' - 12y = 0$, $y(0) = -4$, $y'(0) = 1$
9. $y'' + y' - 2y = 0$, $y(0) = 2$, y'(0) =
10. $y'' - y = 0$, $y(0) = 1$, $y'(0) = -1$
11. $y'' + 3y' = 0$, $y(0) = 2$, $y'(0) = 3$
12. $y'' = 0$, $y(0) = -3$, $y'(0) = 1$
13. $y'' + 4y' + 3y = 0$, $y(0) = -2$, $y'(0) = -6$
14. $y'' + y' - y = 0$, $y(0) = 9$, $y'(0) = -3$


In exercises 15-19, construct a second-order homogeneous linear DE having 
the given functions as solutions. 


15. 4=e "=e" 


16. y) =e", yp =e > 
17.y, =e", y.=1 
18. y, = e7!, y» = e** 
19. $y_1 = 1$, $y_2 = t$


20. Consider the second-order homogeneous linear equation
$y'' - 6y' + 9y = 0$.


(a) Use the substitution $y = e^{rt}$ to attempt to find two linearly
independent solutions to the given equation.

(b) Explain why your work in (a) only results in one linearly independent
solution, $y_1(t)$.

(c) Verify by direct substitution that $y_2 = te^{3t}$ is a solution to
$y'' - 6y' + 9y = 0$. Explain why this function is linearly independent
from $y_1$ found in (a).

(d) State the general solution to the given equation. 


21. Consider the second-order homogeneous linear equation
$y'' - 2y' + 5y = 0$.


(a) Use the substitution $y = e^{rt}$ to attempt to find two linearly
independent solutions to the given equation.

(b) Explain why your work in (a) does not generate any real solutions to
the given equation.

(c) Verify by direct substitution that $y_1 = e^t\cos 2t$ and $y_2 = e^t\sin 2t$ are
solutions to $y'' - 2y' + 5y = 0$. Explain why these functions are linearly
independent. 

(d) State the general solution to the given equation. 


22. Consider the second-order homogeneous linear equation $y'' + 4y = 0$.

(a) Use the substitution $y = e^{rt}$ to attempt to find two linearly
independent solutions to the given equation.

(b) Explain why your work in (a) does not generate any real solutions to
the given equation.

(c) Think about familiar functions that can satisfy the condition that “the
second derivative equals $-4$ times the function itself.” By making a
natural guess and verifying by direct substitution, find two linearly
independent functions $y_1$ and $y_2$ that satisfy the given differential
equation.

(d) State the general solution to the given equation. 


Recall that in a spring-mass system, the displacement y(t) of the mass from its 
natural equilibrium is governed by the equation 


$$y'' + \frac{c}{m}y' + \frac{k}{m}y = \frac{1}{m}F(t)$$


where c is the damping constant, k is the spring constant, m is the mass of the 
suspended object, and F is the forcing function. 


23. For an unforced system with c = 3, k = 2, and m = 1, determine the 
displacement of the mass at time t if the system is set in motion via the 
initial conditions y(0) = 2, y’(0) = 1. Sketch a graph of the solution you 
determine and discuss the long-term behavior of the spring-mass system. 
Assume consistent units on all constants. 


24. For an unforced spring-mass system with k = 9, c = 12, and m= 3, 
determine the displacement of the mass from equilibrium at time t if 
y(0) =0 and y’(0) = —1. Assume consistent units on all constants. 
Recall that in a standard RLC electrical circuit, the current $I(t)$ satisfies the
equation

$$LI''(t) + RI'(t) + \frac{1}{C}I(t) = E'(t)$$


where L is the inductance, R is the resistance, C is the capacitance, and E(t) 
represents an external voltage source. 


25. For an RLC circuit with no external voltage source, L = 20, R = 80, and 
C = 1/60, determine the current at time t given the initial conditions
I(0) = 100, I’(0) = 25. Graph the solution you determine and discuss 
the long-term behavior of the current. Assume consistent units on all 
constants. 


26. For an RLC circuit with no external voltage source, L = 20, R= 0, and 
C = 1/60, determine the current at time t given the initial conditions
I(0) = 100, I’(0) = 25. Graph the solution you determine and discuss 
the long-term behavior of the current. Assume consistent units on all 
constants. 


4.3 Homogeneous equations: repeated and complex roots 


In the preceding section, we observed that any time the characteristic equation
of the second-order equation $y'' + a_1y' + a_0y = 0$ has two real, distinct roots, the
general solution of the differential equation is easily determined. However, in
an equation such as

$$y'' - 6y' + 9y = 0 \tag{4.3.1}$$

with characteristic equation $r^2 - 6r + 9 = 0$, the only root of this equation is
$r = 3$. Although this leads us to the solution $y_1 = e^{3t}$, we do not immediately
see how to find a second linearly independent solution. In a similar way, the
equation

$$y'' - 2y' + 5y = 0 \tag{4.3.2}$$

has characteristic equation $r^2 - 2r + 5 = 0$, whose roots are $r = 1 \pm 2i$.

In this case, we see that no real solution to (4.3.2) results using our previous
approach, so it remains for us to find two real linearly independent solutions.
Now we will endeavor to understand how to address these two cases: when
roots of the characteristic equation are repeated and when the roots of the
characteristic equation are complex.


4.3.1 Repeated roots 


Let us consider the second-order homogeneous linear DE given by

$$y'' + 4y' + 4y = 0 \tag{4.3.3}$$

Its characteristic equation is $r^2 + 4r + 4 = (r + 2)^2 = 0$, so that only the
solution $y_1 = e^{-2t}$ results from the guess that $y = e^{rt}$. To find a second linearly
independent solution, it is natural to think that we need to somehow complicate
the function $y = e^{-2t}$, just as we did in section 3.5 when we encountered the
similar case where the coefficient matrix of a $2 \times 2$ system of linear first-order
DEs had a repeated eigenvalue.

Thus, we consider a second potential solution

$$y_2 = v(t)e^{-2t}$$

where $v(t)$ is a function yet to be determined. By using this function and
substituting into the equation $y'' + 4y' + 4y = 0$, we find conditions that $v(t)$
must satisfy. First, observe by the product rule that

$$y_2' = -2ve^{-2t} + v'e^{-2t} \tag{4.3.4}$$

Similarly,

$$y_2'' = 4ve^{-2t} - 4v'e^{-2t} + v''e^{-2t} \tag{4.3.5}$$

Next, substituting into (4.3.3), we find

$$\begin{aligned}
0 &= y_2'' + 4y_2' + 4y_2 \\
&= (4ve^{-2t} - 4v'e^{-2t} + v''e^{-2t}) + 4(-2ve^{-2t} + v'e^{-2t}) + 4(ve^{-2t}) \\
&= v''e^{-2t}
\end{aligned} \tag{4.3.6}$$

Since $e^{-2t}$ is never zero, it follows that $v''(t)$ must equal zero for all values of $t$.
This implies that $v(t)$ can be any linear function. Because all we seek is one
function $y_2 = v(t)e^{-2t}$ that is a solution to (4.3.3) and is linearly independent
from $y_1 = e^{-2t}$, it suffices to choose $v(t) = t$. Specifically,

$$y_2 = te^{-2t}$$

is a second linearly independent solution to (4.3.3). The general solution is
therefore

$$y(t) = c_1e^{-2t} + c_2te^{-2t}$$


The condition we derived at (4.3.6) for v(t) will hold in any situation where 
the characteristic equation of a second-order linear homogeneous DE has a 
repeated root. This leads us to state the following theorem. 


Theorem 4.3.1 For any second-order linear homogeneous differential
equation of the form

$$y'' + 2ky' + k^2y = 0$$

whose characteristic equation has repeated real root $r = -k$, the general solution
to the differential equation is

$$y = c_1e^{-kt} + c_2te^{-kt}$$
Before proceeding to the case of complex roots, we consider one example to
demonstrate theorem 4.3.1 at work.


Example 4.3.1 Determine the general solution to the equation

$$y'' - 10y' + 25y = 0 \tag{4.3.7}$$

Solution. The characteristic equation of the given DE is $r^2 - 10r + 25 =
(r - 5)^2 = 0$, which has the repeated root $r = 5$. By theorem 4.3.1 (with $k = -5$), it follows
that the general solution to (4.3.7) is

$$y = c_1e^{5t} + c_2te^{5t}$$
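
The claim of theorem 4.3.1 can be spot-checked numerically. The following is a minimal sketch, not part of the original text: the function name `repeated_root_residual` and the sample constants are our own illustrative choices. It differentiates $y = (c_1 + c_2t)e^{-kt}$ by the product rule and evaluates the left-hand side $y'' + 2ky' + k^2y$, which should vanish up to rounding error.

```python
import math

def repeated_root_residual(k, c1, c2, t):
    """Left-hand side y'' + 2k y' + k^2 y for y = (c1 + c2*t) * exp(-k*t)."""
    r = -k                                # the repeated root
    p = c1 + c2 * t                       # linear factor multiplying the exponential
    e = math.exp(r * t)
    y = p * e
    dy = (c2 + r * p) * e                 # product rule
    d2y = (2 * r * c2 + r * r * p) * e    # product rule applied a second time
    return d2y + 2 * k * dy + k * k * y

# Example 4.3.1 corresponds to k = -5 (repeated root r = 5).
for t in (0.0, 0.3, 1.0):
    print(t, repeated_root_residual(-5.0, 2.0, -3.0, t))
```

The printed residuals should be zero or on the order of floating-point rounding error for every choice of $c_1$, $c_2$, and $t$.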


4.3.2 Complex roots 


We continue to be guided throughout our work with second-order linear
homogeneous equations by the informed guess that the solution has form
$y = e^{rt}$. When this guess and the corresponding characteristic equation result
in two distinct, real values of $r$, we have found the general solution to the given
differential equation. Likewise, we have just shown that when the characteristic
equation has only one real root, we can still find the general solution to the DE.
We next explore how, even in the complex case, we can find the general solution
through our original guess, $y = e^{rt}$.

We return to the example

$$y'' - 2y' + 5y = 0 \tag{4.3.8}$$

and recall that the roots of the characteristic equation are $r = 1 \pm 2i$. While this
suggests that $z(t) = e^{(1+2i)t}$ should be a solution of the differential equation, the
function $z(t)$ is complex-valued. When we encountered a similar situation in
section 3.5 for a linear system whose coefficient matrix had complex eigenvalues
and complex eigenvectors, we used Euler’s formula to separate such a complex-valued
function into real and imaginary parts in order to find real solutions.
We proceed similarly here. Recall that Euler’s formula states that
$e^{i\theta} = \cos\theta + i\sin\theta$, so

$$e^{(a+bi)t} = e^{at}e^{ibt} = e^{at}(\cos bt + i\sin bt)$$

For the complex solution $z(t)$ to (4.3.8), we thus find that

$$\begin{aligned}
z(t) &= e^{(1+2i)t} = e^t(\cos 2t + i\sin 2t) \\
&= e^t\cos 2t + ie^t\sin 2t
\end{aligned} \tag{4.3.9}$$

In (4.3.9), we see that $z(t)$ has been written in the form

$$z(t) = \mathrm{Re}(z) + i\,\mathrm{Im}(z)$$


where Re(z) and Im(z) are themselves real-valued functions of t. Based on our 
experience with systems of differential equations with complex-valued solutions, 
it is natural at this point to hope that both the real and imaginary parts of $z(t)$
will be linearly independent solutions to (4.3.8).

Indeed, if we let $y_1 = e^t\cos 2t$ and $y_2 = e^t\sin 2t$, then it can be shown
by direct substitution that both $y_1$ and $y_2$ are solutions to (4.3.8). Because $y_1$
and $y_2$ are not scalar multiples of each other, these two functions are linearly
independent, and therefore, by theorem 4.2.3, it follows that

$$y(t) = c_1e^t\cos 2t + c_2e^t\sin 2t$$


is the general solution to (4.3.8). 

The direct substitution that is used to verify that the real and imaginary
parts of $z(t)$ are solutions to the original equation is somewhat tedious, but
not difficult. In fact, in the more general case where we have complex roots
$a \pm bi$, it can be similarly verified by direct substitution into the corresponding
second-order equation that $y_1 = e^{at}\cos bt$ and $y_2 = e^{at}\sin bt$ are each solutions
to the equation. Note that this scenario implies that the characteristic equation
has form $C(r) = 0$ where

$$\begin{aligned}
C(r) &= [r - (a + bi)][r - (a - bi)] \\
&= r^2 - (a + bi)r - (a - bi)r + (a + bi)(a - bi) \\
&= r^2 - 2ar + (a^2 + b^2)
\end{aligned} \tag{4.3.10}$$

This shows that, up to a scalar multiple of the equation, complex roots to the
characteristic equation arise from second-order homogeneous linear differential
equations of the form

$$y'' - 2ay' + (a^2 + b^2)y = 0 \tag{4.3.11}$$


Our work above now enables us to state a formal result on finding real, linearly 
independent solutions from complex-valued ones. 


Theorem 4.3.2 Let $a$ and $b$ be real constants with $b \neq 0$. For the second-order
homogeneous linear differential equation

$$y'' - 2ay' + (a^2 + b^2)y = 0$$

the roots of the corresponding characteristic equation are $r = a \pm bi$ and the
general solution to the differential equation is given by

$$y = c_1e^{at}\cos bt + c_2e^{at}\sin bt$$


Note that it is precisely the presence of complex roots to the characteristic 
equation that produces the periodic functions cos bt and sin bt in the solution. 
In physical situations such as spring-mass systems and RLC circuits where we 
anticipate that solutions will have a sinusoidal component, we can expect that 
the characteristic equation will have complex roots. 
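
The three cases developed in sections 4.2 and 4.3 (distinct real roots, a repeated real root, and complex roots) can be collected in a short routine. The sketch below is our own illustration rather than anything from the text: the name `general_solution_form` and the string format of its output are assumptions, and the exact comparison of the discriminant with zero is only reliable for integer or otherwise exactly representable coefficients.

```python
def general_solution_form(a1, a0):
    """Describe the general solution of y'' + a1*y' + a0*y = 0 as a string."""
    disc = a1 * a1 - 4 * a0               # discriminant of r^2 + a1*r + a0
    if disc > 0:                          # two distinct real roots
        s = disc ** 0.5
        r1, r2 = (-a1 + s) / 2, (-a1 - s) / 2
        return f"c1*exp({r1:g}*t) + c2*exp({r2:g}*t)"
    if disc == 0:                         # repeated real root (theorem 4.3.1)
        r = -a1 / 2
        return f"c1*exp({r:g}*t) + c2*t*exp({r:g}*t)"
    a, b = -a1 / 2, (-disc) ** 0.5 / 2    # complex roots a +/- bi (theorem 4.3.2)
    return f"exp({a:g}*t)*(c1*cos({b:g}*t) + c2*sin({b:g}*t))"

print(general_solution_form(7, 12))    # distinct real roots -3 and -4
print(general_solution_form(4, 4))     # repeated root -2, as in (4.3.3)
print(general_solution_form(-2, 5))    # complex roots 1 +/- 2i, as in (4.3.8)
```

Note that only the complex branch produces sine and cosine terms, in agreement with the observation above.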

We conclude this section by applying theorem 4.3.2 in the following 
example. 
Example 4.3.2 Solve the initial-value problem given by

$$y'' + 2y' + 10y = 0, \quad y(0) = 1, \quad y'(0) = 1$$

Plot the solution and discuss its long-term behavior.

Solution. We first find the general solution to the given differential equation.
The corresponding characteristic equation is $r^2 + 2r + 10 = 0$, with roots
$r = -1 \pm 3i$.

By theorem 4.3.2 it follows that the general solution is

$$y = c_1e^{-t}\cos 3t + c_2e^{-t}\sin 3t$$

To determine the solution to the stated IVP, first note that $y(0) = 1$ implies that

$$1 = c_1e^0\cos(0) + c_2e^0\sin(0)$$

so that $c_1 = 1$. In addition, since

$$y' = -c_1e^{-t}\cos 3t - c_2e^{-t}\sin 3t - 3c_1e^{-t}\sin 3t + 3c_2e^{-t}\cos 3t$$

it follows from the fact that $y'(0) = 1$ that

$$1 = -c_1 + 3c_2$$

Since $c_1 = 1$, we find that $c_2 = 2/3$ and hence the solution to the IVP is

$$y = e^{-t}\cos 3t + \frac{2}{3}e^{-t}\sin 3t$$

Plotting the function $y$ in figure 4.2, we see that the function $y(t)$ oscillates
due to the presence of the trigonometric functions, while $y(t) \to 0$ as $t \to \infty$
because of the damping effect of $e^{-t}$.


In fact, the graphical behavior demonstrated by y(t) in figure 4.2 is precisely 
what we would expect if the given IVP was modeling a spring-mass system where 
relatively small damping is present: the mass will oscillate once sent in motion, 
but will eventually return to equilibrium. 
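
The computation of the constants in example 4.3.2 follows a pattern that holds for any pair of complex roots $a \pm bi$: setting $t = 0$ in $y = e^{at}(c_1\cos bt + c_2\sin bt)$ and in its derivative gives $y(0) = c_1$ and $y'(0) = ac_1 + bc_2$. The sketch below, with the hypothetical helper name `complex_root_ivp`, solves these two conditions exactly and then evaluates the resulting solution at a large time to illustrate the decay.

```python
from fractions import Fraction
import math

def complex_root_ivp(a, b, y0, dy0):
    """Constants (c1, c2) in y = exp(a t)(c1 cos(b t) + c2 sin(b t))
    matching y(0) = y0 and y'(0) = dy0, assuming b != 0."""
    c1 = Fraction(y0)                                     # y(0) = c1
    c2 = (Fraction(dy0) - Fraction(a) * c1) / Fraction(b) # y'(0) = a*c1 + b*c2
    return c1, c2

# Example 4.3.2: roots -1 +/- 3i with y(0) = 1, y'(0) = 1.
c1, c2 = complex_root_ivp(-1, 3, 1, 1)

def y(t):
    """Solution of the IVP in example 4.3.2."""
    return math.exp(-t) * (float(c1) * math.cos(3 * t) + float(c2) * math.sin(3 * t))

print(c1, c2)      # the constants found in the example
print(y(10.0))     # small in magnitude, reflecting the damping factor exp(-t)
```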


Exercises 4.3. In exercises 1-9, use the characteristic equation to determine 
the general solution to the given second-order linear homogeneous differential 
equation. 


1. $y'' - 8y' + 16y = 0$
2. $y'' + y' + y = 0$

3. $y'' + y' + 5y = 0$

4. $y'' - 4y = 0$
5. $y'' + 4y = 0$
6. $y'' - 10y' + 50y = 0$
7. $y'' - 10y' + 25y = 0$
8. $y'' = 0$
9. $2y'' + 7y' + 5y = 0$

[Figure 4.2: A plot of the solution $y(t)$ to the IVP given in example 4.3.2.]


In exercises 10-18, solve the stated initial-value problem. In addition, graph 
your solution and discuss its long-term behavior. Note that the general solution 
to each equation has been found in corresponding problems in exercises 1-9. 


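10. $y'' - 8y' + 16y = 0$, $y(0) = -4$, $y'(0) = 1$
11. $y'' + y' + y = 0$, $y(0) = 2$, $y'(0) = 2$
12. $y'' + y' + 4y = 0$, $y(0) = 0$, $y'(0) = -1$
13. $y'' - 4y = 0$, $y(0) = 7$, $y'(0) = -5$
14. $y'' + 4y = 0$, $y(0) = 2$, $y'(0) = 3$
15. $y'' - 10y' + 50y = 0$, $y(0) = -3$, $y'(0) = 1$
16. $y'' - 10y' + 25y = 0$, $y(0) = -2$, $y'(0) = -6$
17. $y'' = 0$, $y(0) = 0$, $y'(0) = 0$
18. $2y'' + 7y' + 5y = 0$, $y(0) = 9$, $y'(0) = -3$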


19. Consider the second-order linear homogeneous equation
$y'' - 6y' + 9y = 0$.

(a) Find the general solution $y$ of the given equation.

(b) Convert the given equation to a system $x' = Ax$ of two first-order
equations using the substitution $x_1 = y$, $x_2 = y'$.

(c) Solve the system $x' = Ax$.

(d) Compare your results for $y$ and $x_1$. What do you observe?


20. Consider the second-order linear homogeneous equation
$y'' + 6y' + 10y = 0$.

(a) Find the general solution $y$ of the given equation.

(b) Convert the given equation to a system $x' = Ax$ of two first-order
equations using the substitution $x_1 = y$, $x_2 = y'$.

(c) Solve the system $x' = Ax$.

(d) Compare your results for $y$ and $x_1$. What do you observe?


21. Consider the general second-order linear homogeneous equation with
constant coefficients given by

$$y'' + a_1y' + a_0y = 0$$

Under what conditions on $a_1$ and $a_0$ does the characteristic equation have two real
distinct roots? one real repeated root? two distinct complex roots?


Recall that in a spring-mass system, the displacement $y(t)$ of the mass from its
natural equilibrium is governed by the equation

$$y'' + \frac{c}{m}y' + \frac{k}{m}y = \frac{1}{m}F(t)$$

where $c$ is the damping constant, $k$ is the spring constant, $m$ is the mass of the
suspended object, and $F(t)$ is the forcing function. In the following exercises,
we assume that units on all quantities and constants are consistent.


22. For an unforced spring-mass system with c= 2, k =1, and m= 1, 
determine the displacement of the mass at time t if the system is set in 
motion with the initial conditions y(0) = 2, y’(0) = 1. Sketch the solution 
you determine and discuss the behavior of the spring-mass system. 


23. For an unforced, undamped spring-mass system with k = 9 and m= 3, 
determine the displacement of the mass from equilibrium at time t if 
y(0) =2 and y’(0) = 1. Sketch the solution you determine and discuss 
the behavior of the spring-mass system. 


24. For an unforced spring-mass system with c = 1, k = 2, and m= 1, 
determine the displacement of the mass at time t if the system is set in 
motion with the initial conditions y(0) = 2, y’(0) = 1. Sketch the solution 
you determine and discuss the behavior of the spring-mass system. 


Recall that in a standard RLC electrical circuit, the current I(t) satisfies the 
equation 


$$LI''(t) + RI'(t) + \frac{1}{C}I(t) = E'(t)$$


where L is the inductance, R is the resistance, C is the capacitance, and E(t) 
represents an external voltage source. In the following exercises, we assume that 
units on all quantities and constants are consistent. 
25. For an RLC circuit with no external voltage source, L = 10, R = 40, and
C = 1/40, determine the current at time t given the initial conditions 
I(0) = 100, I’(0) = 25. Sketch the solution you determine and discuss 
the behavior of the current. 


26. For an RLC circuit with no external voltage source, L = 10, R = 40, and 
C = 1/50, determine the current at time t given the initial conditions
I(0) = 100, I’(0) = 25. Sketch the solution you determine and discuss 
the behavior of the current. 


27. For an RLC circuit with no external voltage source, L = 10, R= 0, and 
C = 1/90, determine the current at time t given the initial conditions
I(0) = 100, I’(0) = 25. Sketch the solution you determine and discuss 
the behavior of the current. 


4.4 Nonhomogeneous equations 


As motivated by a spring-mass system with a driving force or an RLC circuit
with an external voltage source, we are now interested in solving second-order
nonhomogeneous linear differential equations of the form

$$y'' + a_1y' + a_0y = f(t) \tag{4.4.1}$$

where $f(t)$ is not zero. We already know a theoretical way to solve such an
equation: through the substitution $x_1 = y$ and $x_2 = y'$, we can convert (4.4.1) to
a system of two first-order equations in the form $x' = Ax + b$ and solve the two
first-order DEs. While this approach works in theory, the actual execution of
the process can be cumbersome. In fact, it is often much easier to solve (4.4.1)
directly through the approaches we present in this section.

Analogous to several other types of linear algebraic and linear differential
equations, a general principle from our work with nonhomogeneous equations
guides us throughout: we first seek a complementary solution $y_h(t)$ to the
corresponding homogeneous equation

$$y'' + a_1y' + a_0y = 0 \tag{4.4.2}$$

and then determine a particular solution $y_p(t)$ to the nonhomogeneous
equation (4.4.1). It follows that $y = y_h + y_p$ will be the general solution to
the nonhomogeneous equation. Indeed, we have the following theorem, a part
of whose formal proof will be addressed in exercise 33 at the end of this section.


Theorem 4.4.1 Given the equation

$$y'' + a_1y' + a_0y = f(t) \tag{4.4.3}$$

where $a_0$ and $a_1$ are constants, if $y_h(t)$ is the general solution to the
corresponding homogeneous equation $y'' + a_1y' + a_0y = 0$ and $y_p(t)$ is any
solution to the nonhomogeneous equation (4.4.3), then $y = y_h + y_p$ is the general
solution to (4.4.3).
We already understand how to find $y_h$, which depends entirely on the roots
to the characteristic equation $r^2 + a_1r + a_0 = 0$ as discussed in sections 4.2
and 4.3. It remains, however, to find $y_p$. To do so, we explore two methods: the
guessing technique of undetermined coefficients, and the brute force technique
of variation of parameters. Each of these methods is analogous to those that may
be used to solve nonhomogeneous systems of the form $x' = Ax + b$.


4.4.1 Undetermined coefficients 


At this point in our discussion, examples are instructive. We consider several 
different nonhomogeneous linear second-order DEs to see how making 
reasonable guesses for the form of yp(t) can lead to the general solution in 
many elementary cases. Throughout, we use the following idea to guide our 
choice of the form of yp(t): since the first and second derivatives of many 
functions are similar to the original function (e.g., derivatives of sine and cosine 
functions are cosine and sine functions, derivatives of exponential functions are 
exponential functions, derivatives of polynomial functions are polynomials), 
and in equations of the form (4.4.3) we take linear combinations of y, y’, and y” 
to get f(t), it is reasonable to guess that the form of yp(t) will be similar to the 
form of f(t), the forcing function in the nonhomogeneous equation. We first 
see this for polynomial functions in the first example. 


Example 4.4.1 Determine the general solution to

$$y'' - 3y' - 4y = 4t^2 + 2t - 9 \tag{4.4.4}$$

Solution. For the associated homogeneous equation, $y'' - 3y' - 4y = 0$,
by theorem 4.2.4 the complementary solution is $y_h = c_1e^{-t} + c_2e^{4t}$.

For a particular solution, we naturally guess that $y_p$ has the form

$$y_p = at^2 + bt + c \tag{4.4.5}$$

based on the form of the forcing function. The undetermined coefficients $a$, $b$,
and $c$ are found by direct substitution into (4.4.4). Note that $y_p' = 2at + b$ and
$y_p'' = 2a$, so that from (4.4.4) we find

$$2a - 3(2at + b) - 4(at^2 + bt + c) = 4t^2 + 2t - 9$$

Rearranging the left-hand side of this equation, it follows that

$$-4at^2 + (-6a - 4b)t + (2a - 3b - 4c) = 4t^2 + 2t - 9 \tag{4.4.6}$$

Equating like coefficients of the power functions present in (4.4.6), the system
of equations

$$\begin{aligned}
-4a &= 4 \\
-6a - 4b &= 2 \\
2a - 3b - 4c &= -9
\end{aligned}$$

must hold. We see that $a = -1$, from which it follows that $b = 1$ and $c = 1$ so
that $y_p = -t^2 + t + 1$. Combining this with $y_h$, we have determined that the
general solution to (4.4.4) is

$$y = c_1e^{-t} + c_2e^{4t} - t^2 + t + 1$$


We can imagine that if $f(t)$ was a polynomial other than $4t^2 + 2t - 9$, we would
have guessed that $y_p$ was a general polynomial of the same degree with unknown
coefficients. This approach almost always works; we will discuss some exceptions
that can arise after examples involving non-polynomial forcing functions.
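
For a quadratic forcing function, the system obtained by equating coefficients is always triangular and can be solved from the top down. The following is a minimal sketch under the assumption that $a_0 \neq 0$ (so that no polynomial term belongs to the complementary solution); the function name `quadratic_particular` is our own and does not come from the text.

```python
from fractions import Fraction

def quadratic_particular(a1, a0, p2, p1, p0):
    """Coefficients (a, b, c) of y_p = a*t^2 + b*t + c solving
    y'' + a1*y' + a0*y = p2*t^2 + p1*t + p0, assuming a0 != 0."""
    a1, a0 = Fraction(a1), Fraction(a0)
    a = Fraction(p2) / a0                # t^2 terms:  a0*a = p2
    b = (p1 - 2 * a1 * a) / a0           # t terms:    2*a1*a + a0*b = p1
    c = (p0 - 2 * a - a1 * b) / a0       # constants:  2*a + a1*b + a0*c = p0
    return a, b, c

# Example 4.4.1: y'' - 3y' - 4y = 4t^2 + 2t - 9
print(quadratic_particular(-3, -4, 4, 2, -9))
```

For the data of example 4.4.1 the three returned values agree with $a = -1$, $b = 1$, $c = 1$ found above. Exact rational arithmetic is used so that the comparison of coefficients is not clouded by rounding.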


Example 4.4.2 Determine the general solution to

$$y'' - y = 16e^{3t} \tag{4.4.7}$$

Solution. Just as in example 4.4.1, we first solve the corresponding
homogeneous equation and find $y_h$. Doing so, we observe that for $y'' - y = 0$,
the solution $y_h$ is

$$y_h = c_1e^t + c_2e^{-t}$$

For the particular solution, we use the natural guess that $y_p = Ae^{3t}$. From this,
$y_p' = 3Ae^{3t}$ and $y_p'' = 9Ae^{3t}$, so substituting into (4.4.7), we find

$$9Ae^{3t} - Ae^{3t} = 16e^{3t}$$

Equating the coefficients of $e^{3t}$, it follows that $8A = 16$, so $A = 2$ and therefore
$y_p = 2e^{3t}$.
Hence we have found the general solution of (4.4.7) to be

$$y = y_h + y_p = c_1e^t + c_2e^{-t} + 2e^{3t}$$

Here, we observe that if $f(t)$ in (4.4.7) were a different exponential function, say
of the form $f(t) = Be^{kt}$, we would again guess that $y_p = Ae^{kt}$. This is based on
the fact that our guess for $y_p$ incorporates all the possible forms of the derivatives
of $f(t)$. Just as with polynomial forcing functions, this approach almost always
works. We will consider situations where these natural educated guesses can fail
following one more example.


Example 4.4.3 Determine the general solution to

$$y'' - y' - 2y = 10\sin t \tag{4.4.8}$$

Solution. First, we observe that the complementary solution can be shown
to be

$$y_h = c_1e^{2t} + c_2e^{-t}$$

To find $y_p$, we guess that

$$y_p = A\sin t + B\cos t$$

Note that we must include the cosine function in $y_p$ in order to account for the
fact that the cosine function arises in the derivative of $f(t) = 10\sin t$.
From our guess for $y_p$, it follows that $y_p' = A\cos t - B\sin t$ and $y_p'' =
-A\sin t - B\cos t$. Substituting in (4.4.8), we see that $A$ and $B$ must satisfy
the equation

$$(-A\sin t - B\cos t) - (A\cos t - B\sin t) - 2(A\sin t + B\cos t) = 10\sin t \tag{4.4.9}$$

Rearranging (4.4.9) in order to compare coefficients of the sine and cosine
functions, we have

$$(-A + B - 2A)\sin t + (-B - A - 2B)\cos t = 10\sin t$$

from which it follows that $-3A + B = 10$ and $-A - 3B = 0$. Consequently,
$A = -3$ and $B = 1$, so that $y_p = -3\sin t + \cos t$. Therefore we have shown that
the general solution of (4.4.8) is

$$y = y_h + y_p = c_1e^{2t} + c_2e^{-t} - 3\sin t + \cos t$$

In the more general setting where we imagine the forcing function $f(t)$ involving
$\sin kt$ or $\cos kt$, it will be natural to make the guess that $y_p = A\sin kt + B\cos kt$,
which again will work in most cases.
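
The comparison of sine and cosine coefficients always leads to a $2 \times 2$ linear system. For $y'' + a_1y' + a_0y = F\sin kt$ and the guess $y_p = A\sin kt + B\cos kt$, substituting and collecting terms gives $(a_0 - k^2)A - a_1kB = F$ and $a_1kA + (a_0 - k^2)B = 0$. The sketch below solves this system exactly; the function name `sine_forcing_particular` and the choice to raise an error in the exceptional case are our own assumptions.

```python
from fractions import Fraction

def sine_forcing_particular(a1, a0, k, F):
    """(A, B) with y_p = A*sin(k t) + B*cos(k t) solving
    y'' + a1*y' + a0*y = F*sin(k t)."""
    p = Fraction(a0) - Fraction(k) ** 2   # the coefficient a0 - k^2
    q = Fraction(a1) * Fraction(k)        # the coefficient a1*k
    det = p * p + q * q                   # determinant of the 2x2 system
    if det == 0:
        # Happens only when sin(kt) and cos(kt) solve the homogeneous equation.
        raise ValueError("the natural guess duplicates the complementary solution")
    # sine terms:   p*A - q*B = F
    # cosine terms: q*A + p*B = 0
    return F * p / det, -F * q / det

# Example 4.4.3: y'' - y' - 2y = 10 sin t
print(sine_forcing_particular(-1, -2, 1, 10))
```

With the data of example 4.4.3 the routine reproduces $A = -3$ and $B = 1$. The determinant vanishes only when $a_1 = 0$ and $a_0 = k^2$, which is exactly the situation in which the natural guess fails.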

We have hinted that while the method of undetermined coefficients
will usually work, it can occasionally fail. What can go wrong? First, if the
forcing function $f(t)$ is particularly complicated, this can make determining a
reasonable guess for $y_p$ challenging. Moreover, even if $f(t)$ is a relatively simple
function, its derivatives may take on unusual forms (for example, $f(t) = \ln t$,
where $f'(t)$ and $f''(t)$ are not logarithmic), and we may find it difficult or impossible
to find a form of $y_p$ that works. These two situations will be addressed by the
variation of parameters method that we introduce in the next subsection.

In addition, there is one more case in which undetermined coefficients can 
fail, yet the difficulty is straightforward to reconcile. An example is instructive. 


Example 4.4.4 Find the general solution to the differential equation

$$y'' - y = 16e^{-t} \tag{4.4.10}$$

Solution. Note that this differential equation is nearly identical to the one
considered in example 4.4.2, but here the forcing function is $f(t) = 16e^{-t}$,
rather than $f(t) = 16e^{3t}$.

As above, it still holds that $y_h = c_1e^t + c_2e^{-t}$. In addition, we naturally
guess that $y_p = Ae^{-t}$, from which it follows that $y_p' = -Ae^{-t}$ and $y_p'' = Ae^{-t}$.
Substituting in (4.4.10), we have

$$Ae^{-t} - Ae^{-t} = 16e^{-t}$$

But this last equality is clearly impossible, regardless of the value of $A$, since
$0 = 16e^{-t}$ is never true.

We can determine where the method failed by observing that in this case,
our guess for the particular solution $y_p$ was actually part of the complementary
solution. Note that $y_h = c_1e^t + c_2e^{-t}$, from which it follows that $y_p$ cannot have
the form $Ae^{-t}$, since this latter function belongs to $y_h$.

We therefore need a more complicated guess for $y_p$; a natural one to
attempt is

$$y_p = Ate^{-t} \tag{4.4.11}$$

where we have introduced the additional multiplier $t$. From this, $y_p' = -Ate^{-t} +
Ae^{-t}$ and $y_p'' = Ate^{-t} - Ae^{-t} - Ae^{-t}$. Substituting in (4.4.10), it now follows that

$$(Ate^{-t} - 2Ae^{-t}) - (Ate^{-t}) = 16e^{-t}$$

Rearranging and simplifying this last equation in order to compare like
coefficients of $e^{-t}$ and $te^{-t}$, we see that the terms involving $te^{-t}$ drop out
and we are left with

$$-2Ae^{-t} = 16e^{-t}$$

so that $A = -8$ and $y_p = -8te^{-t}$.
We therefore have shown that the general solution is

$$y = y_h + y_p = c_1e^t + c_2e^{-t} - 8te^{-t}$$


The preceding example shows that if the form of the forcing function matches 
the form of one or more parts of the complementary solution y;,, then we have 
to use a different, more complicated guess for yp than the most natural one. One 
more example will be helpful before we make some general conclusions. 


Example 4.4.5 Find the general solution of

$$y'' - y' = 4t \tag{4.4.12}$$

Solution. From the characteristic equation $r^2 - r = 0$ for the corresponding
homogeneous equation, we quickly deduce that

$$y_h = c_1 + c_2e^t$$

Next, since $f(t) = 4t$, we naturally guess that $y_p$ is a first order polynomial:
$y_p = at + b$. From this, $y_p' = a$ and $y_p'' = 0$. Substituting in (4.4.12), we find

$$0 - a = 4t$$

Clearly, there is no value of $a$ that makes $-a = 4t$ for all values of $t$, so there can
be no particular solution $y_p$ of the form $y_p = at + b$. From another perspective,
we can see why this must be true by observing that the “$b$” in our guess for $y_p$
is already part of the complementary solution since any constant function is a
solution to $y'' - y' = 0$.

Therefore, we revise our guess for $y_p$ and assume it has form $y_p = t(at + b) =
at^2 + bt$. Doing so, we now have $y_p' = 2at + b$ and $y_p'' = 2a$, so substituting
in (4.4.12) it follows that

$$2a - (2at + b) = 4t$$

Rearranging so that we can equate like coefficients, we have

$$-2at + (2a - b) = 4t$$

so $-2a = 4$ and $2a - b = 0$. It follows that $a = -2$ and $b = -4$, and thus
$y_p = -2t^2 - 4t$. Therefore, we have found the general solution of (4.4.12) to be

$$y = c_1 + c_2e^t - 2t^2 - 4t$$


From our work with examples 4.4.1 through 4.4.5, we observe that the method of
undetermined coefficients breaks down into two fundamental cases:


Case 1. No functions in the assumed particular solution $y_p$ are also
solutions to the associated homogeneous differential equation.

Case 2. A function in the assumed particular solution $y_p$ is also a solution
of the associated homogeneous differential equation.


Moreover, we can observe that when the forcing function f(t) is a sum 
of polynomial, exponential, and sine and cosine functions, the linearity of the 
differential equation allows us to guess a form for yp that is an appropriate sum 
of all the different types of functions represented. The following example shows 
some of the variety that arises in choosing the form of yp. 


Example 4.4.6 Write an appropriate guess for yp for each of the following 
equations. Do not solve for the unknown coefficients. 


(a) y’ + y =4e7* + 52? 

(b) y” — 5y' — 6y = 3e7*! + 4cos3t 

(c) $y'' - 2y' + 5y = 3te^t$

(d) $y'' - 4y' - 5y = 3e^{2t}\sin t$

Solution.


(a) The forcing function f(t) = 4e*! +. 5t? combines an exponential function 
and a second degree polynomial, so we would guess that 
Yp = Ae*! + bt? + ct +d. 


(b) The natural guess is yp = Ae~*! 4+ Bcos3t + Csin3t to account for the 
exponential and trigonometric functions present. 


(c) $f(t) = 3te^t$ is a product of a linear function and an exponential one. Its
derivatives will be sums of functions of the same form and constant
multiples of exponential functions, so we assume that
$y_p = Ate^t + Be^t = e^t(At + B)$.


(d) We observe that every derivative of $f(t) = 3e^{2t}\sin t$ is the sum of functions
of the form $Ae^{2t}\cos t + Be^{2t}\sin t$, so that we would guess that
$y_p = Ae^{2t}\cos t + Be^{2t}\sin t$.
Note the general rule we are using in case 1 and example 4.4.6: provided the
terms of $f(t)$ do not belong to $y_h$, the form of $y_p$ is a linear combination of all
linearly independent functions that are generated by repeated differentiation of
the forcing function $f(t)$.

For dealing with equations that fall into case 2, we make a guess $y_p$ that is
a sum of functions similar to those present in $f(t)$. We then have to tack on
powers of $t$ to modify any parts of $y_p$ that already appear in $y_h$. In particular,
we use the rule that if any part of $y_p$ contains terms that duplicate terms in $y_h$,
then we must multiply that part by $t^n$ using the smallest possible value of $n$ to
eliminate the duplication.

For example, if we wanted to solve $y'' + 4y' + 4y = 3e^{-2t}$, which has
characteristic equation $r^2 + 4r + 4 = (r + 2)^2 = 0$, our work in section 4.3
implies that

$$y_h = c_1e^{-2t} + c_2te^{-2t}$$

Therefore, for the form of $y_p$, which we initially might assume to be $y_p = Ae^{-2t}$,
we see that we must in fact introduce a multiplier of $t^2$ in order to ensure that
$y_p$ does not appear in $y_h$. Thus, the appropriate form of $y_p$ is $y_p = At^2e^{-2t}$.
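
For an exponential forcing function $Be^{kt}$, the smallest power $t^n$ can be read off from the characteristic polynomial $C(r) = r^2 + a_1r + a_0$: substituting $y_p = At^ne^{kt}$ shows that $n = 0$ works when $C(k) \neq 0$ (with $A = B/C(k)$), $n = 1$ works when $C(k) = 0$ but $C'(k) \neq 0$ (with $A = B/C'(k)$), and $n = 2$ is needed when $k$ is a repeated root (with $A = B/2$). The sketch below encodes this; the function name `exponential_particular` is a hypothetical label of ours, and the formulas for $A$ are a standard consequence of the substitution rather than statements taken from the text.

```python
from fractions import Fraction

def exponential_particular(a1, a0, k, B):
    """Return (A, n) so that y_p = A * t**n * exp(k*t) solves
    y'' + a1*y' + a0*y = B*exp(k*t), with n as small as possible."""
    a1, a0, k, B = map(Fraction, (a1, a0, k, B))
    C = k * k + a1 * k + a0      # characteristic polynomial evaluated at r = k
    dC = 2 * k + a1              # its derivative evaluated at r = k
    if C != 0:                   # case 1: exp(kt) is not in y_h
        return B / C, 0
    if dC != 0:                  # case 2: k is a simple root, multiply by t
        return B / dC, 1
    return B / 2, 2              # case 2: k is a repeated root, multiply by t^2

print(exponential_particular(0, -1, 3, 16))    # example 4.4.2
print(exponential_particular(0, -1, -1, 16))   # example 4.4.4
print(exponential_particular(4, 4, -2, 3))     # y'' + 4y' + 4y = 3e^{-2t}
```

The first two calls reproduce $y_p = 2e^{3t}$ and $y_p = -8te^{-t}$ from examples 4.4.2 and 4.4.4, while the third returns $n = 2$, in agreement with the form $y_p = At^2e^{-2t}$.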

A few more examples of the possibilities that arise in case 2 are useful. 


Example 4.4.7 Write an appropriate trial solution $y_p$ for each of the following
examples. Do not solve for the unknown coefficients.


(a) $y'' - y = 4e^t + 5e^{-t}$
(b) $y'' + 4y = 4\cos 2t$
(c) $y'' - 2y' + y = 3te^t$


Solution.


(a) Observe from the characteristic equation $r^2 - 1 = 0$ that $y_h = c_1e^t + c_2e^{-t}$,
so both parts of the forcing function appear in $y_h$. We therefore assume
that $y_p = Ate^t + Bte^{-t}$.


(b) The characteristic equation is $r^2 + 4 = 0$ with roots $r = \pm 2i$. It follows that
$y_h = c_1\sin 2t + c_2\cos 2t$. Since $\cos 2t$ appears in the forcing function, and
both $\sin 2t$ and $\cos 2t$ arise in $y_h$, the appropriate guess for $y_p$ is
$y_p = At\cos 2t + Bt\sin 2t$.


(c) Note that the characteristic equation is $r^2 - 2r + 1 = (r - 1)^2 = 0$ so that
$y_h = c_1e^t + c_2te^t$. The natural guess $e^t(At + B)$ consists entirely of terms
that are included in $y_h$, and multiplying by $t$ still leaves the term $te^t$. This
implies that we must multiply by $t^2$ and choose $y_p = t^2e^t(At + B) = At^3e^t + Bt^2e^t$.


Obviously the method of undetermined coefficients requires us to be experienced
with a wide range of examples and to understand how the derivatives of
the forcing function behave. The exercises at the end of this section will provide
further practice in this regard.


4.4.2 Variation of parameters


Recall that we are focusing on solving the nonhomogeneous linear second-order
equation

$$y'' + a_1y' + a_0y = f(t)$$

While the method of undetermined coefficients works well for a reasonable
collection of forcing functions, it has some fairly strict limitations. In particular,
it is unclear whether it is possible to make a reasonable guess for $y_p$ in order to
solve an equation such as $y'' + 4y' - 5y = \ln t$. In fact, we cannot: the derivative
of the logarithm function is not a logarithm, and this is the main issue that
prevents the use of this method.¹

Here, we study a method that will enable us, in theory, to solve a much 
wider class of nonhomogeneous linear second-order equations; as always, the 
approach requires us to find the general solution to the related homogeneous 
equation first. 

Let us again consider the equation

$$y'' + a_1y' + a_0y = f(t) \tag{4.4.13}$$

where $a_0$ and $a_1$ are constant and assume only that $f(t)$ is continuous. Suppose
we know that $y_1(t)$ and $y_2(t)$ are linearly independent solutions of the associated
homogeneous equation, so the complementary solution is $y_h = c_1y_1(t) + c_2y_2(t)$.
In the method of undetermined coefficients, we made a guess of a particular
solution $y_p$ to (4.4.13) based on the form of $f(t)$. In the method of variation of
parameters, we assume instead that the form of $y_p$ is a more complicated version
of $y_h$. In particular, we assume that $y_p$ has the form

$$y_p = u_1(t)y_1(t) + u_2(t)y_2(t) \tag{4.4.14}$$

for unknown functions $u_1$ and $u_2$, where again $y_1$ and $y_2$ are the functions that
arose in solving the related homogeneous equation.

The goal of variation of parameters is to find the functions $u_1(t)$ and $u_2(t)$
such that the function $y_p = u_1y_1 + u_2y_2$ is a particular solution to (4.4.13). Let us
explore what conditions $u_1(t)$ and $u_2(t)$ must satisfy. Differentiating $y_p$ yields

$$y_p' = u_1y_1' + u_1'y_1 + u_2y_2' + u_2'y_2 \tag{4.4.15}$$

While it seems natural at this point to differentiate again to find $y_p''$ and substitute
into the differential equation, this becomes rather complicated.

Above we have seen that the two unknown functions must satisfy one 
condition (so far), that being the differential equation itself, as stated in (4.4.13). 
Because we have two functions, we have the freedom to set a second condition 
as well. In order to make the functions as simple as possible, and to eliminate 


' If we tried the guess yp = Alnt, then Yp = A/t; which introduces a function of an entirely new 


form. If we tried yp = Alnt + B/t, then the derivative leads us to a function involving 1/ t?, again 
of a form not considered. 
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the second derivatives of u,; and u from arising in Yp , we impose a second 
condition given by 

uyyi + usy2 =0 (4.4.16) 
Observe now that by substituting the condition (4.4.16) in (4.4.15) we have 

$$y_p' = u_1 y_1' + u_2 y_2'$$

so that 

$$y_p'' = u_1' y_1' + u_1 y_1'' + u_2' y_2' + u_2 y_2''$$

Substituting the above expressions for $y_p'$ and $y_p''$ in (4.4.13) yields 

$$(u_1' y_1' + u_1 y_1'' + u_2' y_2' + u_2 y_2'') + a_1(u_1 y_1' + u_2 y_2') + a_0(u_1 y_1 + u_2 y_2) = f(t) \tag{4.4.17}$$

Reorganizing (4.4.17) according to the terms $u_1$, $u_2$, $u_1'$, and $u_2'$, we have 

$$u_1(y_1'' + a_1 y_1' + a_0 y_1) + u_2(y_2'' + a_1 y_2' + a_0 y_2) + (u_1' y_1' + u_2' y_2') = f(t) \tag{4.4.18}$$

Now, at this point we recall that $y_1$ and $y_2$ are fundamental solutions to 
the associated homogeneous equation $y'' + a_1 y' + a_0 y = 0$, which shows that 
in (4.4.18) the coefficients of both $u_1$ and $u_2$ are zero. Therefore, (4.4.18) 
reduces to 

$$u_1' y_1' + u_2' y_2' = f(t) \tag{4.4.19}$$

Combining conditions (4.4.16) and (4.4.19) results in the system of linear 
equations in $u_1'$ and $u_2'$ given by 

$$\begin{aligned} y_1 u_1' + y_2 u_2' &= 0 \\ y_1' u_1' + y_2' u_2' &= f(t) \end{aligned}$$

To solve for $u_1'$ and $u_2'$, we multiply the first equation by $y_2'$ and the second 
equation by $y_2$, which gives 

$$\begin{aligned} y_1 y_2' u_1' + y_2 y_2' u_2' &= 0 \\ y_2 y_1' u_1' + y_2 y_2' u_2' &= y_2 f \end{aligned} \tag{4.4.20}$$

Subtracting the second equation from the first in (4.4.20), we have 

$$y_1 y_2' u_1' - y_2 y_1' u_1' = -y_2 f$$

and therefore 

$$u_1' = \frac{y_2 f}{y_2 y_1' - y_1 y_2'} \tag{4.4.21}$$

Using similar algebra to solve for $u_2'$, we may show that 

$$u_2' = \frac{y_1 f}{y_1 y_2' - y_2 y_1'} \tag{4.4.22}$$

Finally, to determine $u_1$ and $u_2$, we integrate to find 

$$u_1 = \int \frac{y_2 f}{y_2 y_1' - y_1 y_2'}\,dt \quad \text{and} \quad u_2 = \int \frac{y_1 f}{y_1 y_2' - y_2 y_1'}\,dt \tag{4.4.23}$$


Once we integrate in (4.4.23) to solve for $u_1$ and $u_2$, we can conclude that 
a particular solution $y_p$ to the original nonhomogeneous linear second-order 
differential equation is $y_p = u_1 y_1 + u_2 y_2$, where $y_h = c_1 y_1 + c_2 y_2$. Examples will 
be helpful to demonstrate the key steps of this method. First, we state the formal 
result proved by our discussion above. 


Theorem 4.4.2. (Variation of Parameters Method) For the differential 
equation $y'' + a_1 y' + a_0 y = f(t)$, where $f$ is continuous, assume that $y_1$ and $y_2$ 
are linearly independent solutions of the corresponding homogeneous equation 
$y'' + a_1 y' + a_0 y = 0$. Then, a particular solution to the nonhomogeneous 
equation is $y_p = u_1 y_1 + u_2 y_2$, where $u_1$ and $u_2$ satisfy 

$$u_1 = \int \frac{y_2 f}{y_2 y_1' - y_1 y_2'}\,dt \quad \text{and} \quad u_2 = \int \frac{y_1 f}{y_1 y_2' - y_2 y_1'}\,dt \tag{4.4.24}$$


Example 4.4.8 Solve the differential equation 

$$y'' + y = \sec t \tag{4.4.25}$$

where we assume that $-\frac{\pi}{2} < t < \frac{\pi}{2}$. 


Solution. We first observe that the corresponding characteristic equation is 
$r^2 + 1 = 0$, so that the complementary solution is $y_h = c_1 \cos t + c_2 \sin t$. In 
particular, $y_1 = \cos t$ and $y_2 = \sin t$. 

We now seek two functions $u_1(t)$ and $u_2(t)$ that satisfy the equations (4.4.24). 
Since $y_1 = \cos t$ and $y_2 = \sin t$, it follows that $y_1' = -\sin t$ and 
$y_2' = \cos t$, and therefore, we have 

$$u_1 = \int \frac{y_2 f}{y_2 y_1' - y_1 y_2'}\,dt = \int \frac{\sin t \sec t}{-\sin^2 t - \cos^2 t}\,dt = -\int \sin t \sec t\,dt = -\int \frac{\sin t}{\cos t}\,dt = \ln(\cos t)$$

and 

$$u_2 = \int \frac{y_1 f}{y_1 y_2' - y_2 y_1'}\,dt = \int \frac{\cos t \sec t}{\cos^2 t + \sin^2 t}\,dt = \int 1\,dt = t$$

Note that we have used the fundamental trigonometric identity $\sin^2 t + \cos^2 t = 1$ 
as well as other standard trigonometric relationships such as $\sec t = 1/\cos t$. 
Also, since we are seeking any two functions $u_1$ and $u_2$ that satisfy (4.4.24), it is 
not necessary to include the constants that can arise in integrating. 

Hence we have found that $u_1 = \ln(\cos t)$ and $u_2 = t$. This enables us to 
conclude that a particular solution to the equation (4.4.25) is 

$$y_p = u_1 y_1 + u_2 y_2 = \ln(\cos t)\cos t + t \sin t$$

and, therefore, the general solution is 

$$y = y_h + y_p = c_1 \cos t + c_2 \sin t + \ln(\cos t)\cos t + t \sin t$$
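
As a numerical cross-check of the formulas in (4.4.24), the following Python sketch evaluates the two integrals with composite Simpson's rule rather than by finding antiderivatives. The helper names `simpson` and `particular_solution`, the lower limit of integration `t0 = 0`, and the sample point `t = 0.7` are illustrative assumptions and do not come from the text. With the lower limit 0 the integrals reproduce $u_1 = \ln(\cos t)$ and $u_2 = t$ with no additive constants, so the result should agree with the closed form found above.

```python
import math

def simpson(g, a, b, n=200):
    """Composite Simpson's rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def particular_solution(y1, dy1, y2, dy2, f, t0):
    """Return y_p(t) = u1*y1 + u2*y2, with u1 and u2 integrated from t0 as in (4.4.24)."""
    def u1_prime(s):
        return y2(s) * f(s) / (y2(s) * dy1(s) - y1(s) * dy2(s))

    def u2_prime(s):
        return y1(s) * f(s) / (y1(s) * dy2(s) - y2(s) * dy1(s))

    def yp(t):
        u1 = simpson(u1_prime, t0, t)
        u2 = simpson(u2_prime, t0, t)
        return u1 * y1(t) + u2 * y2(t)

    return yp

# y'' + y = sec t with y1 = cos t and y2 = sin t
yp = particular_solution(
    math.cos, lambda t: -math.sin(t),
    math.sin, math.cos,
    lambda t: 1 / math.cos(t),
    0.0,
)

t = 0.7
closed_form = math.log(math.cos(t)) * math.cos(t) + t * math.sin(t)
print(yp(t), closed_form)
```

A different lower limit would change $u_1$ and $u_2$ by constants, which only adds a multiple of $y_1$ and $y_2$ to $y_p$; the result is still a particular solution.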
Example 4.4.9 Solve the equation 

$$y'' + 4y' + 4y = e^{-2t} \ln t \tag{4.4.26}$$

Solution. To begin, we solve the associated homogeneous equation and get 

$$y_h = c_1 e^{-2t} + c_2 t e^{-2t}$$

Thus for variation of parameters, we assume that 

$$y_p = u_1(t) e^{-2t} + u_2(t) t e^{-2t}$$

and we seek $u_1$ and $u_2$. Since $y_1 = e^{-2t}$ and $y_2 = te^{-2t}$, it follows that $y_1' = -2e^{-2t}$ 
and $y_2' = -2te^{-2t} + e^{-2t}$, and therefore by (4.4.24) 

$$u_1 = \int \frac{y_2 f}{y_2 y_1' - y_1 y_2'}\,dt = \int \frac{te^{-2t}(e^{-2t} \ln t)}{te^{-2t}(-2e^{-2t}) - e^{-2t}(-2te^{-2t} + e^{-2t})}\,dt$$

$$= \int \frac{te^{-4t} \ln t}{-e^{-4t}}\,dt = -\int t \ln t\,dt = -\frac{1}{2} t^2 \ln t + \frac{1}{4} t^2$$

and 

$$u_2 = \int \frac{y_1 f}{y_1 y_2' - y_2 y_1'}\,dt = \int \frac{e^{-2t}(e^{-2t} \ln t)}{e^{-4t}(-2t + 1 + 2t)}\,dt = \int \ln t\,dt = t \ln t - t$$

From these expressions for $u_1$ and $u_2$, we can conclude that the overall form of 
the solution $y$ to (4.4.26) is 

$$y = y_h + y_p$$

$$= c_1 e^{-2t} + c_2 t e^{-2t} + \left(-\frac{1}{2} t^2 \ln t + \frac{1}{4} t^2\right) e^{-2t} + (t \ln t - t)\, t e^{-2t}$$

$$= c_1 e^{-2t} + c_2 t e^{-2t} + \frac{1}{4} t^2 e^{-2t} (2 \ln t - 3)$$
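
A quick way to gain confidence in a particular solution is to substitute it back into the equation numerically. The sketch below does this for the particular solution $y_p = \frac{1}{4} t^2 e^{-2t}(2 \ln t - 3)$ of (4.4.26), approximating the derivatives with central differences. The function names, the step size `h`, and the sample points are illustrative assumptions, and the check is a sanity test rather than a proof.

```python
import math

def y_p(t):
    """Particular solution y_p = (1/4) t^2 e^{-2t} (2 ln t - 3) of (4.4.26)."""
    return 0.25 * t ** 2 * math.exp(-2 * t) * (2 * math.log(t) - 3)

def residual(y, t, h=1e-4):
    """Approximate y'' + 4y' + 4y - e^{-2t} ln t at t using central differences."""
    d1 = (y(t + h) - y(t - h)) / (2 * h)
    d2 = (y(t + h) - 2 * y(t) + y(t - h)) / h ** 2
    return d2 + 4 * d1 + 4 * y(t) - math.exp(-2 * t) * math.log(t)

# The residual should be close to zero wherever ln t is defined (t > 0).
worst = max(abs(residual(y_p, t)) for t in (0.5, 1.0, 2.0, 3.0))
print(worst)
```

The residual is not exactly zero because finite differences carry truncation and rounding error; a value several orders of magnitude below the size of the individual terms is what one would expect here.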


Exercises 4.4 In exercises 1-10, determine the complementary solution $y_h$ 
and state the general form of $y_p$ that you would guess in applying the method of 
undetermined coefficients. 


1. y” —y' —12y = 10e*! 
2. $y'' + y' - 2y = 4t^2 - 1$ 
3. $y'' - y = 11e^{t}$ 
4. $y'' + 3y' = 3 \sin 2t$ 
5. $y'' = t^2 + 3$ 
6. $y'' + 4y' + 3y = 2t + 4 \cos t$ 
7. $y'' + 4y' + 4y = t^2$ 
8. $y'' + 4y = 2 \sin 2t$ 
9. $y'' + 4y = 20e^{t} \cos t$ 
10. $y'' + y' - y = 3$ 


In exercises 11-20, solve the stated IVP using the method of undetermined 
coefficients. Note that the complementary solutions $y_h$ and appropriate guesses 
for $y_p$ were found in the corresponding exercises 1-10. 


ll. y”—y’—-12y=10e", y(0)=2, y/(0)=-1 

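12. $y'' + y' - 2y = 4t^2 - 1$, $y(0) = 1$, $y'(0) = 1$ 

13. $y'' - y = 11e^{t}$, $y(0) = -3$, $y'(0) = 2$ 

14. $y'' + 3y' = 3 \sin 2t$, $y(0) = 0$, $y'(0) = 0$ 

15. $y'' = t^2 + 3$, $y(0) = -2$, $y'(0) = -2$ 

16. $y'' + 4y' + 3y = 2t + 4 \cos t$, $y(0) = 2$, $y'(0) = 0$ 

17. $y'' + 4y' + 4y = t^2$, $y(0) = 5$, $y'(0) = 3$ 

18. $y'' + 4y = 2 \sin 2t$, $y(0) = 1$, $y'(0) = -1$ 

19. $y'' + 4y = 20e^{t} \cos t$, $y(0) = 0$, $y'(0) = -1$ 

20. $y'' + y' - y = 3$, $y(0) = -1$, $y'(0) = -1$ 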

In exercises 21-27, find the general solution of the given differential equation 
using variation of parameters. 

21. $y'' + y = \tan t$, $-\frac{\pi}{2} < t < \frac{\pi}{2}$ 
22. y’ + 5y’ + 4y = te’ 

23. y” + 4y' +4y = te?# 
24. $y'' + y = \csc t$, $0 < t < \pi$ 
25. y”—2y’'ty=£, 1>0 
26. y"” —4y’+4y =e! 

27. y"+y' -—6by= ea? t>0 


28. For a forced spring-mass system with $c = 2$, $k = 1$, $m = 1$, and 
$F(t) = 20 \cos 2t$, determine the displacement of the mass at time $t$ if the 
system is set in motion by the initial conditions $y(0) = 2$, $y'(0) = 1$. Sketch 
the solution and discuss the long-term behavior of $y_h$ and $y_p$ separately 
and how these together influence the long-term behavior of the system. 


29. For a forced undamped spring-mass system with $k = 8$, $m = 2$, and 
$F(t) = 5 \cos 2.1t$, determine the displacement of the mass at time $t$ if the 
system is set in motion by the initial conditions $y(0) = 2$, $y'(0) = 1$. Sketch 
the solution and discuss the long-term behavior of $y_h$ and $y_p$ separately 
and how these together influence the long-term behavior of the system. 


30. For a forced undamped spring-mass system with $k = 8$, $m = 2$, and 
$F(t) = 5 \cos 2t$, determine the displacement of the mass at time $t$ if the 
system is set in motion by the initial conditions $y(0) = 2$, $y'(0) = 1$. Sketch 
the solution and discuss the long-term behavior of $y_h$ and $y_p$ separately 
and how these together influence the long-term behavior of the system. 


31. For an RLC circuit with external voltage source $E(t) = 100 \sin 20t$, $L = 10$, 
$R = 40$, and $C = 1/40$, determine the current at time $t$ given the initial 
conditions $I(0) = 100$, $I'(0) = 25$. Sketch the solution and discuss the 
long-term behavior of the current. 


32. For an RLC circuit with external voltage source $E(t) = 50 \cos 40t$, $L = 10$, 
$R = 40$, and $C = 1/50$, determine the current at time $t$ given the initial 
conditions $I(0) = 100$, $I'(0) = 25$. Sketch the solution and discuss the 
long-term behavior of the current. 


33. Let 

$$y'' + a_1 y' + a_0 y = f(t) \tag{4.4.27}$$

be a second-order nonhomogeneous linear differential equation with 
constant coefficients. If $y_h$ is the general solution to the homogeneous 
equation $y'' + a_1 y' + a_0 y = 0$ and $y_p$ is any solution to the 
nonhomogeneous equation (4.4.27), show that $y = y_h + y_p$ is 
a solution to (4.4.27). 


4.5 Forced motion: beats and resonance 


Based on our work with second-order differential equations, we are now able 
to completely solve the damped harmonic oscillator equation for a variety of 
forcing functions. In particular, we are able to determine the general solution of 
the spring-mass system equation 


$$y'' + \frac{c}{m} y' + \frac{k}{m} y = \frac{1}{m} F(t) \tag{4.5.1}$$


by finding complementary and particular solutions. In this section, we explore 
some interesting phenomena related to periodic forcing functions F(t). Our 
work will have important consequences for the study of other applications 
modeled by similar differential equations, including RLC circuits. 

We begin by considering a sequence of related examples. As always, we 
assume that the units on all constants and related quantities are consistent. 


Example 4.5.1 Determine the unique solution to the initial-value problem 
given by an undamped spring-mass system with $m = 1$ and $k = 4$ where 
$F(t) = \cos t$. Assume that the mass is initially released after being displaced 0.5 from 
equilibrium. Plot the solution and discuss its long-term behavior. 


Solution. Using the given information in (4.5.1), we see that the system is 
modeled by the initial-value problem 

$$y'' + 4y = \cos t, \quad y(0) = 0.5, \quad y'(0) = 0 \tag{4.5.2}$$

Solving the associated homogeneous equation $y'' + 4y = 0$ provides the 
complementary solution $y_h = c_1 \cos 2t + c_2 \sin 2t$. Applying the method of 
undetermined coefficients with the assumption that $y_p$ has the form 

$$y_p = A \cos t + B \sin t$$

we find upon substituting in (4.5.2) that $A$ and $B$ must satisfy the equation 

$$(-A \cos t - B \sin t) + 4(A \cos t + B \sin t) = \cos t$$

Equating coefficients of $\cos t$ and $\sin t$, it follows that 

$$-A + 4A = 1, \qquad -B + 4B = 0$$

Therefore $A = 1/3$ and $B = 0$, so $y_p = \frac{1}{3} \cos t$ is a particular solution to (4.5.2). 
The general solution to the differential equation is 

$$y = y_h + y_p = c_1 \cos 2t + c_2 \sin 2t + \frac{1}{3} \cos t$$

Finally, we use the stated initial conditions $y(0) = 1/2$ and $y'(0) = 0$ to determine 
the values of $c_1$ and $c_2$. The first condition implies that $1/2 = c_1 + 1/3$ and 
therefore $c_1 = 1/6$. Similarly, $y'(0) = 0$ implies that $0 = 2c_2$ and thus $c_2 = 0$. 
Hence the solution to the IVP is 

$$y = \frac{1}{6} \cos 2t + \frac{1}{3} \cos t$$

In figure 4.3 we observe that the mass exhibits somewhat unusual behavior 
when negatively displaced due to the impact of the forcing function. With the 
undamped system and periodic forcing function, the observed behavior will 
repeat indefinitely. 


Figure 4.3. The solution y to the IVP in example 4.5.1. 


We next explore how slight changes in the forcing function can result in 
substantially different behavior for the system. 


Example 4.5.2 Determine the unique solution to the initial-value problem 
given by an undamped spring-mass system with $m = 1$ and $k = 4$ where 
$F(t) = \cos 1.75t$. Assume that the mass is initially released after being displaced 0.5 
from equilibrium. Plot the solution and discuss its long-term behavior. 


Solution. Similar to our work in example 4.5.1, we see that the system is 
modeled by the initial-value problem 

$$y'' + 4y = \cos 1.75t, \quad y(0) = 0.5, \quad y'(0) = 0 \tag{4.5.3}$$

Because only the forcing function has changed, the complementary solution is 
again $y_h = c_1 \cos 2t + c_2 \sin 2t$. Using the method of undetermined coefficients 
with 

$$y_p = A \cos 1.75t + B \sin 1.75t$$

it follows that $A$ and $B$ must satisfy the equation 

$$\left(-\frac{49}{16} A \cos 1.75t - \frac{49}{16} B \sin 1.75t\right) + 4(A \cos 1.75t + B \sin 1.75t) = \cos 1.75t$$

Equating like coefficients, we can deduce that $A = \frac{16}{15}$ and $B = 0$ so that 

$$y_p = \frac{16}{15} \cos 1.75t$$

and the general solution is 

$$y = c_1 \cos 2t + c_2 \sin 2t + \frac{16}{15} \cos 1.75t$$

Applying the initial condition $y(0) = 1/2$ shows that $c_1 = -17/30$. In 
addition, $y'(0) = 0$ implies that $c_2 = 0$. Hence the solution to the initial-value 
problem (4.5.3) is 

$$y = -\frac{17}{30} \cos 2t + \frac{16}{15} \cos 1.75t$$
When we plot this solution, as shown in figure 4.4, we observe that while the 
solution is again periodic, in this instance there is an interesting pattern in which 
the amplitude of oscillation itself rises and falls. Because the system is undamped, 
this behavior will repeat indefinitely. More importantly, observe how much the 
amplitude has increased in this example with $f(t) = \cos 1.75t$ as compared to 
what we saw in figure 4.3 with $f(t) = \cos t$: the amplitude of the solution in 
figure 4.4 is roughly 3 times that of the solution in figure 4.3, under otherwise 
identical conditions. 


Figure 4.4 The solution to the IVP in example 4.5.2. 


In the solution 

$$y = -\frac{17}{30} \cos 2t + \frac{16}{15} \cos 1.75t$$

to (4.5.3), we observe that we are adding two cosine functions of different 
frequencies to one another. In particular, these two frequencies are quite close to 
each other due to the coefficients "2" and "1.75". This results in the two functions' 
amplitudes sometimes reinforcing each other (such as when both amplitudes 
are large and positive), while at other times their amplitudes negate each other. 
The visual periodic phenomenon seen in figure 4.4 is known as beats. This is 
because the overall wave with the large wavelength appears as a beat and can 
often be heard when two sound waves have approximately the same frequency, 
such as when two instruments are out of tune. We will explore this phenomenon 
from a more rigorous, algebraic perspective shortly. 
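
One way to see the slow rise and fall directly is to rewrite the solution of (4.5.3) with the identity $\cos A - \cos B = 2 \sin\frac{A+B}{2} \sin\frac{B-A}{2}$. Splitting off $\frac{16}{15}(\cos 1.75t - \cos 2t)$ gives $y = \frac{1}{2} \cos 2t + \frac{32}{15} \sin(0.125t) \sin(1.875t)$, in which the factor $\sin(0.125t)$ acts as a slowly varying envelope. The Python sketch below checks numerically that the two forms agree; the function names and the sampling grid are illustrative assumptions.

```python
import math

def y_sum(t):
    """Solution of (4.5.3) written as a sum of two cosines."""
    return -17 / 30 * math.cos(2 * t) + 16 / 15 * math.cos(1.75 * t)

def y_product(t):
    """Same solution after applying cos A - cos B = 2 sin((A+B)/2) sin((B-A)/2)."""
    return 0.5 * math.cos(2 * t) + 32 / 15 * math.sin(0.125 * t) * math.sin(1.875 * t)

# Compare the two forms on a grid covering several beats.
samples = [0.1 * k for k in range(600)]
max_gap = max(abs(y_sum(t) - y_product(t)) for t in samples)

# |sin(0.125 t)| repeats every pi / 0.125 = 8*pi, the spacing between beats.
beat_spacing = math.pi / 0.125
print(max_gap, beat_spacing)
```

The closer the forcing frequency is to 2, the smaller the envelope frequency becomes and the longer each beat lasts.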

In the following example, we consider the case when the forcing function’s 
frequency exactly matches that of the general solution to the corresponding 
homogeneous equation. Again, only a slight change to the forcing function will 
be made when compared to our work above. 


Example 4.5.3. Determine the unique solution to the initial-value problem 
given by an undamped spring-mass system with $m = 1$ and $k = 4$ where 
$F(t) = \cos 2t$. Assume that the mass is initially released after being displaced 0.5 from 
equilibrium. Plot the solution and discuss its long-term behavior. 


Solution. As above, we see that the system is modeled by the initial-value 
problem 

$$y'' + 4y = \cos 2t, \quad y(0) = 0.5, \quad y'(0) = 0 \tag{4.5.4}$$

and the complementary solution is again $y_h = c_1 \cos 2t + c_2 \sin 2t$. Here, we 
observe that the forcing function is one of the linearly independent fundamental 
solutions present in $y_h$. Therefore we must make the modified guess that 

$$y_p = At \cos 2t + Bt \sin 2t$$

in our attempt to find a particular solution. With 

$$y_p' = A \cos 2t - 2At \sin 2t + B \sin 2t + 2Bt \cos 2t$$

and 

$$y_p'' = -4A \sin 2t - 4At \cos 2t + 4B \cos 2t - 4Bt \sin 2t$$

we can substitute into (4.5.4) to see that $A$ and $B$ must satisfy the equation 

$$(-4A \sin 2t - 4At \cos 2t + 4B \cos 2t - 4Bt \sin 2t) + 4(At \cos 2t + Bt \sin 2t) = \cos 2t$$

All terms involving $t \cos 2t$ and $t \sin 2t$ drop out, leaving us with 

$$-4A \sin 2t + 4B \cos 2t = \cos 2t$$

from which it follows that $B = 1/4$ and $A = 0$. Hence $y_p = \frac{1}{4} t \sin 2t$ and thus 

$$y = y_h + y_p = c_1 \cos 2t + c_2 \sin 2t + \frac{1}{4} t \sin 2t$$

Using the initial conditions $y(0) = 0.5$, $y'(0) = 0$, we can show that $c_1 = 1/2$ 
and $c_2 = 0$. Therefore, the solution to the IVP is 

$$y = \frac{1}{2} \cos 2t + \frac{1}{4} t \sin 2t$$

A plot of this solution is shown in figure 4.5; observe the striking behavior that 
the solution not only oscillates, but that its amplitude grows without 
bound as $t \to \infty$. 


When we encounter the phenomenon in figure 4.5 where the solution to the 
harmonic oscillator initial-value problem grows without bound, we say that 
resonance occurs. This situation arises whenever the forcing function is a sine or 
cosine function whose frequency matches the natural frequency of the associated 
undamped homogeneous equation. In this case, the forcing function amplifies 
the natural oscillations of the system and causes them to grow without bound. 
In actual physical applications, such unbounded resonance is not realistic since 
either damping is present to limit the amplitude, the function is no longer a 
reasonable model for the phenomenon being modeled, or the structure simply 
fails. Large-amplitude oscillations do occur when forcing functions are close to 
or at the natural frequency of a structure or device, such as when the frequency 
of vortex shedding equals the natural frequency of bridge cables. 


Figure 4.5 The solution to the IVP in example 4.5.3. 


Our work from examples 4.5.1-4.5.3 can be generalized to the situation 
where the constants $k$ and $m$ present in (4.5.1) are arbitrary. In particular, for 
the undamped, undriven spring-mass system given by 

$$y'' + \frac{k}{m} y = 0 \tag{4.5.5}$$

the general solution is 

$$y_h = c_1 \cos\sqrt{\frac{k}{m}}\,t + c_2 \sin\sqrt{\frac{k}{m}}\,t \tag{4.5.6}$$

Since the mass will undergo one complete cycle as $t$ goes from 0 to $2\pi\sqrt{m/k}$, the 
period of oscillation is $2\pi\sqrt{m/k}$. The number of cycles per second, or frequency, 
is the reciprocal of the period, or $\sqrt{k/m}/(2\pi)$. The angular frequency $\omega_0$, which 
is measured in radians per second, is given by 

$$\omega_0 = \sqrt{\frac{k}{m}}$$

This leads us to write the solution (4.5.6) to the equation (4.5.5) in the form 

$$y_h = c_1 \cos \omega_0 t + c_2 \sin \omega_0 t$$

For the undamped spring-mass system driven by the periodic forcing function 
$F(t) = F_0 \cos \omega t$, 

$$y'' + \frac{k}{m} y = \frac{1}{m} F_0 \cos \omega t \tag{4.5.7}$$

the method of undetermined coefficients can be used to show that 

$$y_p = \frac{F_0}{m(\omega_0^2 - \omega^2)} \cos \omega t$$

provided that $\omega \neq \omega_0$. In this case, the general solution to (4.5.7) is 

$$y = c_1 \cos \omega_0 t + c_2 \sin \omega_0 t + \frac{F_0}{m(\omega_0^2 - \omega^2)} \cos \omega t \tag{4.5.8}$$

When $\omega$ and $\omega_0$ are nearly equal, the system demonstrates near resonance as the 
phenomenon of beats occurs. 
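
The coefficient of $\cos \omega t$ in (4.5.8) can be evaluated directly to see how quickly the forced response grows as $\omega$ approaches $\omega_0$. The following Python sketch does so and recovers the coefficients $1/3$ and $16/15$ found in examples 4.5.1 and 4.5.2. The function name `forced_amplitude` and the choice to raise an error at $\omega = \omega_0$ are illustrative assumptions.

```python
import math

def forced_amplitude(m, k, F0, omega):
    """Coefficient F0 / (m (omega0^2 - omega^2)) of cos(omega t) in (4.5.8)."""
    omega0_sq = k / m
    if math.isclose(omega * omega, omega0_sq):
        # (4.5.8) does not apply when the forcing matches the natural frequency.
        raise ValueError("omega equals omega0; use the resonance form instead")
    return F0 / (m * (omega0_sq - omega * omega))

a_451 = forced_amplitude(1, 4, 1, 1.0)    # example 4.5.1: 1/3
a_452 = forced_amplitude(1, 4, 1, 1.75)   # example 4.5.2: 16/15

# The coefficient grows as omega moves toward omega0 = 2.
for omega in (1.0, 1.75, 1.9, 1.99):
    print(omega, forced_amplitude(1, 4, 1, omega))
```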

In the case where $\omega = \omega_0$, the solution (4.5.8) obviously fails to hold; using 
work that generalizes example 4.5.3, it can be shown that a particular solution 
to (4.5.7) takes the form 

$$y_p = \frac{F_0}{2m\omega_0}\, t \sin \omega_0 t$$

which produces the general solution 

$$y = c_1 \cos \omega_0 t + c_2 \sin \omega_0 t + \frac{F_0}{2m\omega_0}\, t \sin \omega_0 t \tag{4.5.9}$$

In the $y_p$ term in (4.5.9), we see how the solution grows without bound as 
$t \to \infty$. 
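
To make the growth in (4.5.9) concrete, the Python sketch below builds the resonant solution for given constants and estimates when the envelope $\frac{F_0}{2m\omega_0} t$ of the $y_p$ term reaches a chosen threshold. The function names, the threshold value 5, and the decision to measure only the envelope of the $y_p$ term (ignoring the bounded $y_h$ part) are illustrative assumptions.

```python
import math

def resonant_solution(m, k, F0, c1, c2):
    """Return y(t) from (4.5.9) for the given constants."""
    omega0 = math.sqrt(k / m)

    def y(t):
        return (c1 * math.cos(omega0 * t) + c2 * math.sin(omega0 * t)
                + F0 / (2 * m * omega0) * t * math.sin(omega0 * t))

    return y

def time_to_exceed(m, k, F0, threshold):
    """Time at which the envelope F0 * t / (2 m omega0) of y_p equals threshold."""
    omega0 = math.sqrt(k / m)
    return threshold * 2 * m * omega0 / F0

# Example 4.5.3: m = 1, k = 4, F0 = 1, c1 = 1/2, c2 = 0
y = resonant_solution(1, 4, 1, 0.5, 0.0)
t_env = time_to_exceed(1, 4, 1, 5.0)
print(y(math.pi / 4), t_env)
```

Because the envelope is linear in $t$, doubling the threshold doubles the time needed to reach it.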

Having discussed the phenomena of beats and resonance for undamped 
systems, we now briefly consider the situation where damping is present. Above 
we have observed that if $\omega \approx \omega_0$, then beats or resonance can occur. Regardless of 
the comparison of the frequencies of the system itself and the forcing function, 
a periodic forcing function will lead the system to oscillate indefinitely. The 
most important issue to understand is how large those oscillations can grow; 
this is especially critical for applications to vibrations and oscillations in physical 
structures such as bridges. 

Two examples will be discussed to show the impact that different levels of 
damping can have on such a system. 


Example 4.5.4 Determine the unique solution to the initial-value problem 
given by the damped spring-mass system with $m = 1$, $c = 0.1$, and $k = 4$, where 
$F(t) = \cos 2t$. Assume that the mass is initially released after being displaced 0.5 
from equilibrium. Plot the solution and discuss its long-term behavior. 


Solution. The system described above is modeled by the initial-value problem 

$$y'' + 0.1y' + 4y = \cos 2t, \quad y(0) = 0.5, \quad y'(0) = 0 \tag{4.5.10}$$

The characteristic equation is $r^2 + 0.1r + 4 = 0$, whose roots are approximately 

$$r = -\frac{1}{20} \pm 1.999i$$

Thus, the complementary solution to (4.5.10) is 

$$y_h = e^{-t/20}(c_1 \cos 1.999t + c_2 \sin 1.999t)$$

Undetermined coefficients can be used in the usual way with the guess 
$y_p = A \cos 2t + B \sin 2t$ to find that $A = 0$ and $B = 5$ so that $y_p = 5 \sin 2t$. Thus, the 
general solution to the given differential equation is 

$$y = e^{-t/20}(c_1 \cos 1.999t + c_2 \sin 1.999t) + 5 \sin 2t$$

Applying the initial conditions, we can find the values of $c_1$ and $c_2$ to show that 
the solution to the stated IVP is 

$$y = e^{-t/20}\left(\frac{1}{2} \cos 1.999t - 4.989 \sin 1.999t\right) + 5 \sin 2t \tag{4.5.11}$$

In figure 4.6, we see the plot of this solution over the interval $[0, 30\pi]$ and 
we observe that initially the amplitude of oscillations grows, much as it did 
in example 4.5.3 where resonance occurred. Here, however, we have a small 
amount of damping present in the system. Over time, this limits the amplitude 
of oscillations and keeps them from growing without bound, though such 
large-amplitude oscillation can result in damage to physical structures. 


Figure 4.6 The solution to the IVP in example 4.5.4. 


In the solution (4.5.11) to the IVP (4.5.10), we observe two very different 
behaviors in the complementary and particular solutions. Due to the presence 
of $e^{-t/20}$ in $y_h$, we see that as $t \to \infty$, $y_h(t) \to 0$. In contrast, $y_p = 5 \sin 2t$ 
will oscillate continuously between $-5$ and $5$. Because this is the behavior the 
system will tend to over time, we call $y_p$ the steady-state solution. The solution 
$y_h$ is called the transient solution, and is significant only for relatively small 
values of $t$. 

Intuitively, increasing the damping that is present should decrease the 
amplitude of oscillations generated by a periodic forcing function. In 
example 4.5.4, a forcing function with amplitude 1 generated oscillations in 
the system that increased to an amplitude of nearly 5, in part due to the small 
damping constant, as well as the frequency of the forcing function which nearly 
matched the natural frequency of the system. In our next example, we increase 
the amount of damping present to see how this limits the size of the waves 
generated in the solution. 


Example 4.5.5 Determine the unique solution to the initial-value problem 
given by the damped spring-mass system with $m = 1$, $c = 4$, and $k = 4$ where 
$F(t) = \cos 2t$. Assume that the mass is initially released after being displaced 0.5 
from equilibrium. Plot the solution and discuss its long-term behavior. 


Solution. In this final modification of the spring-mass system we have studied 
in examples 4.5.1-4.5.4, we have only increased the damping constant so 
that the system is modeled by the initial-value problem 

$$y'' + 4y' + 4y = \cos 2t, \quad y(0) = 0.5, \quad y'(0) = 0 \tag{4.5.12}$$

At this point in our work, we can show that $y_h = c_1 e^{-2t} + c_2 t e^{-2t}$ and 
$y_p = \frac{1}{8} \sin 2t$. Applying the initial conditions, the solution to the IVP (4.5.12) is 

$$y = \frac{1}{2} e^{-2t} + \frac{3}{4} t e^{-2t} + \frac{1}{8} \sin 2t$$

Plotting this solution, as shown in figure 4.7, we observe that the amplitude 
decreases almost immediately because the complementary solution 
$y_h = \frac{1}{2} e^{-2t} + \frac{3}{4} t e^{-2t}$ vanishes quickly; moreover, only small steady-state oscillations 
persist due to $y_p = \frac{1}{8} \sin 2t$. 


Figure 4.7 The solution to the IVP in example 4.5.5. 
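
The steady-state coefficients in examples 4.5.4 and 4.5.5 can be computed in one step. Substituting $y_p = A \cos \omega t + B \sin \omega t$ into $m y'' + c y' + k y = F_0 \cos \omega t$ (equation (4.5.1) multiplied through by $m$) and equating coefficients of $\cos \omega t$ and $\sin \omega t$ gives the linear system $(k - m\omega^2) A + c\omega B = F_0$ and $-c\omega A + (k - m\omega^2) B = 0$. The Python sketch below solves this system by Cramer's rule; the function name `steady_state` and its return convention are illustrative assumptions.

```python
import math

def steady_state(m, c, k, F0, omega):
    """Return (A, B, amplitude) for y_p = A cos(omega t) + B sin(omega t)."""
    p = k - m * omega ** 2      # coefficient arising from m*y'' + k*y
    q = c * omega               # coefficient arising from c*y'
    det = p * p + q * q         # determinant of the 2x2 system
    if det == 0:
        raise ValueError("undamped resonance: no solution of this form exists")
    A = F0 * p / det
    B = F0 * q / det
    return A, B, math.hypot(A, B)

A1, B1, amp1 = steady_state(1, 0.1, 4, 1, 2)   # example 4.5.4: y_p = 5 sin 2t
A2, B2, amp2 = steady_state(1, 4, 4, 1, 2)     # example 4.5.5: y_p = (1/8) sin 2t
print(amp1, amp2)
```

With the forcing fixed at $\cos 2t$, raising the damping constant from 0.1 to 4 lowers the steady-state amplitude from 5 to 1/8, matching the two examples.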


Exercises 4.5 In exercises 1-5, solve the given initial-value problem for $y(t)$ 
if $y(0) = y'(0) = 0$, given the stated parameters for an undamped spring-mass 
system. In addition, determine the maximum displacement of the mass, state if 
beats or resonance are present, and sketch the solution. 


1. $m = 1$, $k = 25$, $f(t) = 0.01 \cos 5t$ 
2. $m = 2$, $k = 32$, $f(t) = 2 \cos 4t$ 
Smeal k=36,. FH) =2e" 
4. $m = 3$, $k = 150$, $f(t) = 0.6 \cos 7t$ 
5. $m = 2$, $k = 100$, $f(t) = 4 \sin 7t$ 
6. A 2-kg mass is suspended from a spring with $k = 32$. A force 
$f(t) = 0.1 \sin 4t$ is applied to the mass. Calculate the time required for 
failure to occur if the spring breaks when the amplitude of oscillation 
exceeds 0.5. The motion starts at rest and there is no damping present. 
Assume consistent units. 


7. A 20-N weight is suspended from a frictionless spring with k = 98. A force 
of f(t) = 2cos7t acts on the weight; the motion starts at rest. Does the 
system demonstrate resonance, beats, or neither? Explain, including a plot 
of the solution, assuming consistent units throughout. 


In exercises 8-11, find the current $I(t)$ for each simple series circuit (with no 
resistor) if $I(0) = I'(0) = 0$, given the stated parameters for an undamped 
spring-mass system. In addition, determine the maximum current, state if beats or 
resonance are present, and sketch the solution. Assume consistent units. 


8. $C = 10^{-3}$, $L = 0.1$, $E(t) = 120 \cos 101t$ 
9. $C = 0.02$, $L = 0.5$, $E(t) = 10 \sin 10t$ 
10. $C = 10^{-4}$, $L = 1.0$, $E(t) = 120 \sin 100t$ 
11. $C = 10^{-3}$, $L = 0.1$, $E(t) = 240 \cos 10t$ 


12. A forcing function f(t) = 50cos4t N is imposed on a spring-mass system 
for which m = 2, k = 8 N/m, and c = 2kg/s. Determine the amplitude of 
the steady-state solution. 


13. A forcing function f(t) = 10sin(2t) N is imposed on a spring-mass system 
that starts from rest for which m = 2 kg and k = 8 N/m. Determine the 
damping coefficient necessary to limit the amplitude of the resulting 
motion to a maximum of 2 m. 


14. A series circuit is composed of elements for which R = 60Q, 
L=107*H, and C = 107° £. Find the steady-state current if a 
voltage of E(t) = 120cos 120s t is applied. 


4.6 Higher order linear differential equations 


In the preceding sections of this chapter, we have focused on second-order 
linear differential equations. One reason we emphasize second-order equations 
is the importance of the (damped) harmonic oscillator equation. Moreover, 
second-order equations provide an appropriate setting in which to learn a 
variety of key ideas that may be generalized to linear equations of higher order. 
In this section, we consider several examples of higher order equations in 
order to gain exposure to important extensions of concepts we have already 
studied. 

We first consider an example to see the natural approach to a third-order 
equation. 


Example 4.6.1 Find the general solution to the differential equation 

$$y''' - 2y'' - y' + 2y = 0 \tag{4.6.1}$$

Solution. For second-order linear homogeneous equations, we begin with the 
guess that $y = e^{rt}$ and determine the values of $r$ for which $e^{rt}$ is a solution to the 
equation. Doing likewise for this third-order equation, we note that $y' = re^{rt}$, 
$y'' = r^2 e^{rt}$, and $y''' = r^3 e^{rt}$. Substituting into (4.6.1), we find 

$$r^3 e^{rt} - 2r^2 e^{rt} - re^{rt} + 2e^{rt} = 0$$

Factoring, it follows that 

$$e^{rt}[r^2(r - 2) - 1(r - 2)] = 0$$

or 

$$e^{rt}(r - 2)(r^2 - 1) = 0$$

We therefore see that the $r$-values for which $y = e^{rt}$ is a solution to (4.6.1) are 
$r = -1$, $1$, and $2$. 
Using reasoning similar to our work with second-order equations, we now 
expect that the solutions $y_1 = e^{-t}$, $y_2 = e^{t}$, and $y_3 = e^{2t}$ are linearly independent 
and that the general solution is the linear combination 

$$y = c_1 e^{-t} + c_2 e^{t} + c_3 e^{2t}$$


Just as with second-order equations, we call the equation $(r - 2)(r^2 - 1) = 0$ 
that results from the guess $y = e^{rt}$ the characteristic equation. Roots of the 
characteristic equation play a central role in determining solutions to higher 
order equations. Furthermore, example 4.6.1 hints at the fact that we can 
expect several important theoretical results from second-order equations to 
hold for equations of order $n$. We state these results, which are analogous to 
theorems 4.2.1, 4.2.2, and 4.2.3, without proof. Observe that we will use the 
notation $y'''$ to represent the third derivative of $y$, but for any derivative of order 
higher than 3, we use the notation $y^{(n)}$. For example, $y^{(5)}$ is the fifth derivative 
of $y$. 


Theorem 4.6.1 If $y_1, y_2, \ldots, y_k$ are solutions to the nth-order linear 
homogeneous equation 

$$y^{(n)} + a_{n-1}(t) y^{(n-1)} + \cdots + a_1(t) y' + a_0(t) y = 0$$

then $y = c_1 y_1 + c_2 y_2 + \cdots + c_k y_k$ is also a solution for any constants $c_1, \ldots, c_k$. 


From theorem 4.6.1, we expect that linear combinations of fundamental 
solutions will play a key role in our solution to higher order equations. In 
addition, for corresponding initial-value problems, we are again guaranteed the 
existence of unique solutions under sufficiently nice conditions. 


Theorem 4.6.2 Consider the nth-order initial-value problem given by 

$$\begin{aligned} & y^{(n)} + a_{n-1}(t) y^{(n-1)} + \cdots + a_1(t) y' + a_0(t) y = f(t) \\ & y(t_0) = b_0, \quad y'(t_0) = b_1, \quad \ldots, \quad y^{(n-1)}(t_0) = b_{n-1} \end{aligned} \tag{4.6.2}$$

where the coefficient functions $a_i(t)$ and the forcing function $f(t)$ are 
continuous on an open interval $(a, b)$. Given any $t_0$ in $(a, b)$, (4.6.2) has a 
unique solution on $(a, b)$. 


As in our earlier work with second-order linear DEs and systems of linear 
first-order DEs, in our current study of higher order differential equations, we 
usually consider the situation where the coefficient functions a;(t) are constant. 
In this setting, we can deduce the following result. 


Theorem 4.6.3. The set of all solutions to the nth-order homogeneous 
linear differential equation $y^{(n)} + a_{n-1} y^{(n-1)} + \cdots + a_1 y' + a_0 y = 0$, where 
$a_0, \ldots, a_{n-1}$ are constants, is a vector space of dimension $n$. 


From these three results, we see that we can solve any homogeneous linear 
differential equation of order n provided that we can find n linearly independent 
solutions to the equation. Moreover, given such a general solution, we can 
determine the unique solution to any corresponding initial-value problem. With 
second-order equations, we normally verified the linear independence of two 
solutions by confirming that they were not scalar multiples of one another. 
For sets of more than two functions, a more sophisticated tool, the so-called 
Wronskian, is necessary to test for linear independence. 


Definition 4.6.1 Suppose that $y_1, y_2, \ldots, y_n$ are each $(n - 1)$-times differentiable 
functions on an interval $[a, b]$. The Wronskian $W$ of these functions is 
given by 

$$W(t) = \det \begin{bmatrix} y_1 & y_2 & \cdots & y_n \\ y_1' & y_2' & \cdots & y_n' \\ \vdots & \vdots & & \vdots \\ y_1^{(n-1)} & y_2^{(n-1)} & \cdots & y_n^{(n-1)} \end{bmatrix}$$

We emphasize that the Wronskian is itself a single scalar function of $t$. The 
most important feature of the Wronskian is that $W(t)$ is identically zero if and 
only if the functions $y_1, \ldots, y_n$ are linearly dependent. Hence, if $W(t)$ is not 
identically zero, then the functions are linearly independent. We consider an 
elementary example to demonstrate the use of the Wronskian. 


Example 4.6.2. Use the Wronskian to show that the functions $y_1 = e^{-t}$, $y_2 = e^{t}$, 
and $y_3 = e^{2t}$ are linearly independent. 


Solution. From the definition, observe that 

$$W(t) = \det \begin{bmatrix} e^{-t} & e^{t} & e^{2t} \\ -e^{-t} & e^{t} & 2e^{2t} \\ e^{-t} & e^{t} & 4e^{2t} \end{bmatrix}$$

Computing the determinant, we find 

$$\begin{aligned} W(t) &= e^{-t}(4e^{t} e^{2t} - 2e^{t} e^{2t}) - e^{t}(-4e^{-t} e^{2t} - 2e^{-t} e^{2t}) + e^{2t}(-e^{-t} e^{t} - e^{-t} e^{t}) \\ &= 2e^{-t} e^{3t} + 6e^{t} e^{t} - 2e^{2t} \\ &= 6e^{2t} \end{aligned}$$

Since $W(t) \neq 0$, the functions $y_1 = e^{-t}$, $y_2 = e^{t}$, and $y_3 = e^{2t}$ are linearly 
independent. 
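
The determinant in this example is easy to confirm numerically. The Python sketch below evaluates the Wronskian matrix of $e^{-t}$, $e^{t}$, and $e^{2t}$ at a few values of $t$ and compares the result with $6e^{2t}$. The helper names `det3` and `wronskian` and the sample points are illustrative assumptions; the derivatives in each row are entered by hand.

```python
import math

def det3(a):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def wronskian(t):
    """Wronskian of e^{-t}, e^{t}, e^{2t}: rows hold the functions, y', and y''."""
    e_m, e_1, e_2 = math.exp(-t), math.exp(t), math.exp(2 * t)
    return det3([[e_m, e_1, e_2],
                 [-e_m, e_1, 2 * e_2],
                 [e_m, e_1, 4 * e_2]])

for t in (-1.0, 0.0, 0.5, 2.0):
    print(t, wronskian(t), 6 * math.exp(2 * t))
```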


Using the Wronskian, it can be shown that if the characteristic equation of a 
homogeneous linear differential equation of order $n$ has $n$ distinct, real solutions 
$r_1, \ldots, r_n$, then the corresponding functions $y_1 = e^{r_1 t}, \ldots, y_n = e^{r_n t}$ are linearly 
independent, and therefore can be used to form the general solution to the 
equation. In the cases where roots of the characteristic equation are repeated or 
complex, we use ideas similar to those encountered for second-order equations 
to find the required $n$ real linearly independent solutions to the given differential 
equation. The next example examines this situation in the case of a repeated root. 


Example 4.6.3 Determine three linearly independent solutions to the equation

$$y''' - 3y'' + 3y' - y = 0 \tag{4.6.3}$$

and hence state the general solution to the DE.


Solution. The corresponding characteristic equation is

$$r^3 - 3r^2 + 3r - 1 = 0$$

Factoring, it follows that $(r - 1)^3 = 0$, so only one real, repeated root exists:
r = 1. This shows that $y_1 = e^{t}$ is one solution to (4.6.3). Two more solutions
remain to be found. Based on our experience with second-order equations
and theorem 4.3.1, we naturally expect that $y_2 = te^{t}$ and $y_3 = t^2 e^{t}$ will be
solutions to (4.6.3). It is a straightforward exercise to verify that each of these
two functions is a solution to the given equation. Moreover, it may be shown that
the Wronskian of these three functions is nonzero and, therefore, the functions
are linearly independent, so the general solution to (4.6.3) is

$$y = c_1 e^{t} + c_2 te^{t} + c_3 t^2 e^{t} = (c_1 + c_2 t + c_3 t^2)e^{t}$$
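The verification left as an exercise can be sketched in a few lines. The idea, which is our own bookkeeping device rather than something from the text, is to store a function $p(t)e^{rt}$ as the coefficient list of the polynomial p and to use the product rule $(pe^{rt})' = (p' + rp)e^{rt}$. The names `deriv` and `residual` are hypothetical.

```python
def deriv(coeffs, r):
    """Derivative of p(t)e^{rt}; coeffs[k] is the coefficient of t^k in p.

    Returns the coefficient list of p' + r*p (same length as the input).
    """
    out = [r * c for c in coeffs]
    for k in range(1, len(coeffs)):
        out[k - 1] += k * coeffs[k]
    return out

def residual(coeffs, r=1):
    """Polynomial factor of y''' - 3y'' + 3y' - y for y = p(t)e^{rt}."""
    y0 = list(coeffs)
    y1 = deriv(y0, r)
    y2 = deriv(y1, r)
    y3 = deriv(y2, r)
    return [d3 - 3 * d2 + 3 * d1 - d0 for d0, d1, d2, d3 in zip(y0, y1, y2, y3)]

print(residual([1, 0, 0]))     # y = e^t
print(residual([0, 1, 0]))     # y = t e^t
print(residual([0, 0, 1]))     # y = t^2 e^t
print(residual([0, 0, 0, 1]))  # y = t^3 e^t, which is not a solution
```

An all-zero list means the function satisfies (4.6.3). The last call shows why the pattern stops at $t^2 e^{t}$ for a root of multiplicity three: the residual for $t^3 e^{t}$ is not zero.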


The following result, analogous to theorem 4.3.1, holds for repeated roots of
multiplicity k in higher order equations.

Theorem 4.6.4 For any nth-order linear homogeneous differential equation
of the form

$$y^{(n)} + a_{n-1}y^{(n-1)} + \cdots + a_1 y' + a_0 y = 0$$

whose characteristic equation has a repeated root r of multiplicity k, the k
linearly independent solutions of the differential equation corresponding to
r are

$$e^{rt},\quad te^{rt},\quad t^2 e^{rt},\quad \ldots,\quad t^{k-1}e^{rt}$$

We deal with complex roots of the characteristic equation in exactly the
same manner as in the case of second-order differential equations. In particular,
if r = a + ib is a complex root of the characteristic equation, we consider the
complex-valued function

$$z(t) = e^{(a+ib)t} = e^{at}e^{ibt} = e^{at}(\cos bt + i\sin bt)$$

The real and imaginary parts of the complex solution then form linearly
independent solutions to the differential equation. Our next example illustrates
this in detail.


Example 4.6.4 Determine the general solution to the equation

$$y^{(4)} - 2y''' + 14y'' - 18y' + 45y = 0 \tag{4.6.4}$$

Solution. If we consider the characteristic equation $r^4 - 2r^3 + 14r^2 - 18r + 45 = 0$
and factor, we find

$$r^4 - 2r^3 + 14r^2 - 18r + 45 = (r^2 + 9)(r^2 - 2r + 5) = 0$$

from which it follows that $r = \pm 3i$ and $r = 1 \pm 2i$. Thus one complex
solution is

$$z(t) = e^{3it} = \cos 3t + i\sin 3t$$

so that $y_1 = \cos 3t$ and $y_2 = \sin 3t$ are solutions to (4.6.4). Similarly, another
complex solution is

$$z(t) = e^{(1+2i)t} = e^{t}(\cos 2t + i\sin 2t)$$

so that $y_3 = e^{t}\cos 2t$ and $y_4 = e^{t}\sin 2t$ are also real solutions to (4.6.4). We can
now conclude that the general solution to the given differential equation is

$$y = c_1\cos 3t + c_2\sin 3t + c_3 e^{t}\cos 2t + c_4 e^{t}\sin 2t$$
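The four complex roots can be checked by direct substitution into the characteristic polynomial. The sketch below uses Python's built-in complex numbers and Horner's rule; the function name `horner` is our own.

```python
def horner(coeffs, r):
    """Evaluate a polynomial whose coefficients are listed from highest degree down."""
    value = 0
    for c in coeffs:
        value = value * r + c
    return value

# r^4 - 2r^3 + 14r^2 - 18r + 45
char_poly = [1, -2, 14, -18, 45]
roots = [3j, -3j, 1 + 2j, 1 - 2j]

for r in roots:
    # Each magnitude should be zero (up to rounding).
    print(r, abs(horner(char_poly, r)))
```

Because complex roots of a real polynomial occur in conjugate pairs, each pair contributes one cosine and one sine solution, which accounts for the four real functions in the general solution.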


The only remaining case to consider for homogeneous equations is that of
repeated complex roots. In this case, just as with that of repeated real roots, we
multiply the basic solutions that arise by powers of t to build additional linearly
independent solutions. For example, if r = 1 + 2i is a repeated complex root of
multiplicity two, the corresponding four real solutions would be $y_1 = e^{t}\cos 2t$,
$y_2 = e^{t}\sin 2t$, $y_3 = te^{t}\cos 2t$, and $y_4 = te^{t}\sin 2t$.

We also observe at this point that we can solve corresponding initial-value
problems for any given nth-order homogeneous linear equation. Since the
general solution of an nth-order equation has n unknown constants $c_1, \ldots, c_n$,
we will need n initial conditions to uniquely determine their values. The
following example demonstrates the solution of a standard problem.


Example 4.6.5 Find the solution of the initial-value problem

$$y^{(4)} - y = 0, \quad y(0) = y'(0) = y''(0) = y'''(0) = 1 \tag{4.6.5}$$

Solution. With characteristic equation $r^4 - 1 = 0$, it is straightforward to verify
that the roots of this equation are $\pm 1$ and $\pm i$ so that the general solution to the
DE in (4.6.5) is

$$y(t) = c_1 e^{t} + c_2 e^{-t} + c_3\cos t + c_4\sin t$$

The derivatives of y are

$$\begin{aligned}
y'(t) &= c_1 e^{t} - c_2 e^{-t} - c_3\sin t + c_4\cos t \\
y''(t) &= c_1 e^{t} + c_2 e^{-t} - c_3\cos t - c_4\sin t \\
y'''(t) &= c_1 e^{t} - c_2 e^{-t} + c_3\sin t - c_4\cos t
\end{aligned}$$

Using the stated initial conditions, observe that

$$\begin{aligned}
y(0) &= 1 = c_1 + c_2 + c_3 \\
y'(0) &= 1 = c_1 - c_2 + c_4 \\
y''(0) &= 1 = c_1 + c_2 - c_3 \\
y'''(0) &= 1 = c_1 - c_2 - c_4
\end{aligned}$$

Row-reducing this system of linear equations shows that the unique solution is
given by $c_1 = 1$, $c_2 = c_3 = c_4 = 0$, and therefore, the solution to the IVP (4.6.5) is

$$y(t) = e^{t}$$
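The row reduction can be carried out exactly with rational arithmetic. The following sketch implements Gauss-Jordan elimination for a small square system; `solve_linear` is a hypothetical helper written for this illustration and assumes the coefficient matrix is invertible.

```python
from fractions import Fraction

def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with exact rational arithmetic."""
    n = len(a)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [row[-1] for row in m]

# Rows encode y(0), y'(0), y''(0), y'''(0) in terms of c1, c2, c3, c4.
A = [[1, 1, 1, 0],
     [1, -1, 0, 1],
     [1, 1, -1, 0],
     [1, -1, 0, -1]]
c = solve_linear(A, [1, 1, 1, 1])
print(c)
```

The result is the coefficient vector (1, 0, 0, 0), in agreement with the solution found in the example.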


Finally, it remains for us to see how the previous methods of dealing
with nonhomogeneous second-order equations extend to higher order equations.
Just as with second-order equations, we first solve the corresponding
homogeneous equation using the approach discussed above to find the
complementary solution $y_h$. Then, in order to find a particular solution $y_p$ to the
nonhomogeneous equation, we can use extensions of the methods discussed in
section 4.4.

For the method of undetermined coefficients, the approach is essentially
identical: based on the form of the forcing function f(t) and the presence of
fundamental solutions within f, we make a reasonable guess of the form of a
particular solution $y_p$ involving unknown coefficients. By substituting into the
given DE, we determine values for these coefficients and hence $y_p$. The general
solution is then $y = y_h + y_p$. Variation of parameters may also be extended:
given $y_h = c_1 y_1 + c_2 y_2 + \cdots + c_n y_n$, we seek functions $u_1, u_2, \ldots, u_n$ such that
$y_p = u_1 y_1 + u_2 y_2 + \cdots + u_n y_n$ is a solution to the differential equation. This
method is best understood through the theory developed for nonhomogeneous
systems of first-order equations given in section 3.7 and reminding ourselves
that any nth-order linear equation can be converted into a system of n
first-order equations. While this approach provides a guaranteed particular
solution in theory, the computational details are often very complicated. We
therefore choose to focus on those higher order DEs that may be solved using
undetermined coefficients. An example is instructive.


Example 4.6.6 Determine the general solution to the equation

$$y^{(5)} - y''' = 3e^{t} + t^2 - 4 \tag{4.6.6}$$

Solution. We first solve the corresponding homogeneous equation, $y^{(5)} - y''' = 0$,
to determine $y_h$. Since the characteristic equation is $r^5 - r^3 = r^3(r^2 - 1) = 0$,
we see that $y_h$ is given by

$$y_h = c_1 + c_2 t + c_3 t^2 + c_4 e^{t} + c_5 e^{-t}$$

For the nonhomogeneous equation (4.6.6), based on the form of the forcing
function f(t), the natural form to assume for $y_p$ is

$$y_p = Ae^{t} + B + Ct + Dt^2$$

However, since each part of our assumed form of $y_p$ appears in $y_h$, we
modify our guess by multiplying by appropriate powers of t and assume
instead that

$$y_p = Ate^{t} + Bt^3 + Ct^4 + Dt^5$$

From this, we observe that to substitute $y_p$ into (4.6.6) we need to know $y_p'''$
and $y_p^{(5)}$. By repeated differentiation,

$$\begin{aligned}
y_p' &= Ate^{t} + Ae^{t} + 3Bt^2 + 4Ct^3 + 5Dt^4 \\
y_p'' &= Ate^{t} + 2Ae^{t} + 6Bt + 12Ct^2 + 20Dt^3 \\
y_p''' &= Ate^{t} + 3Ae^{t} + 6B + 24Ct + 60Dt^2 \\
y_p^{(4)} &= Ate^{t} + 4Ae^{t} + 24C + 120Dt \\
y_p^{(5)} &= Ate^{t} + 5Ae^{t} + 120D
\end{aligned}$$

Substituting in (4.6.6), it follows that

$$Ate^{t} + 5Ae^{t} + 120D - (Ate^{t} + 3Ae^{t} + 6B + 24Ct + 60Dt^2) = 3e^{t} + t^2 - 4$$

so that

$$2Ae^{t} - 60Dt^2 - 24Ct + 120D - 6B = 3e^{t} + t^2 - 4$$

Equating like coefficients, we find that A = 3/2, D = -1/60, C = 0, and B = 1/3.
Hence, we have found

$$y_p = \frac{3}{2}te^{t} + \frac{1}{3}t^3 - \frac{1}{60}t^5$$

Therefore, the general solution to (4.6.6) is

$$y = y_h + y_p = c_1 + c_2 t + c_3 t^2 + c_4 e^{t} + c_5 e^{-t} + \frac{3}{2}te^{t} + \frac{1}{3}t^3 - \frac{1}{60}t^5$$
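A numerical spot check helps catch arithmetic slips in undetermined coefficients. The sketch below hard-codes the formulas for the third and fifth derivatives of the guess $y_p = Ate^{t} + Bt^3 + Ct^4 + Dt^5$ with the coefficient values found in the example, and compares the left side of (4.6.6) with the forcing function at a few sample points. The function names are our own.

```python
import math

# Coefficients found by equating like terms.
A, B, C, D = 3 / 2, 1 / 3, 0.0, -1 / 60

def yp_third(t):
    """Third derivative of y_p = A t e^t + B t^3 + C t^4 + D t^5."""
    return A * t * math.exp(t) + 3 * A * math.exp(t) + 6 * B + 24 * C * t + 60 * D * t**2

def yp_fifth(t):
    """Fifth derivative of the same y_p."""
    return A * t * math.exp(t) + 5 * A * math.exp(t) + 120 * D

def forcing(t):
    """Right-hand side 3e^t + t^2 - 4 of the differential equation."""
    return 3 * math.exp(t) + t**2 - 4

for t in (-2.0, 0.0, 1.0, 3.0):
    # The two printed values should agree up to rounding error.
    print(t, yp_fifth(t) - yp_third(t), forcing(t))
```

Changing any one of the four coefficients makes the two columns disagree, which is a convenient way to see that each coefficient is pinned down by a different term of the forcing function.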


Throughout this section we have seen that the approaches needed to solve
nth-order linear equations are nearly identical to those we use for second-order
equations. The main differences are that the characteristic equation is generally
difficult, if not impossible, to factor, and we have to be especially cognizant of
repeated roots in determining $y_h$ and $y_p$.


4.6.1 Solving characteristic equations using Maple 


While solving linear differential equations of order n requires nearly identical 
methods to DEs of order 2, there is one added challenge from the outset: solving 
the characteristic equation. The characteristic equation is a polynomial equation 
of degree n; while every such equation of degree 2 can be solved using the 
quadratic formula, equations of higher order can be much more difficult, and 
(for equations of degree 5 and higher) often impossible, to solve by algebraic 
means. 

Computer algebra systems like Maple provide useful assistance in this 
matter with commands for solving equations exactly and approximately. For 
example, say we have the characteristic equation

$$r^4 - r^3 - 7r^2 + r + 6 = 0$$

To solve this exactly in Maple, we enter

> solve(r^4 - r^3 - 7*r^2 + r + 6 = 0, r);

Maple produces the output

$$-1,\ 1,\ -2,\ 3$$


showing that these are the four roots of the characteristic equation. 
Of course, not all polynomial equations will have all integer solutions, much
less all real solutions. For example, if we consider the equation

$$r^4 + r^3 + r^2 + r + 1 = 0$$

and use the solve command, we see that

> solve(r^4 + r^3 + r^2 + r + 1 = 0, r);

results in the output

$$\begin{gathered}
-\tfrac{1}{4} + \tfrac{1}{4}\sqrt{5} + \tfrac{1}{4} I\sqrt{2}\sqrt{5 + \sqrt{5}},\quad
-\tfrac{1}{4} - \tfrac{1}{4}\sqrt{5} + \tfrac{1}{4} I\sqrt{2}\sqrt{5 - \sqrt{5}}, \\
-\tfrac{1}{4} - \tfrac{1}{4}\sqrt{5} - \tfrac{1}{4} I\sqrt{2}\sqrt{5 - \sqrt{5}},\quad
-\tfrac{1}{4} + \tfrac{1}{4}\sqrt{5} - \tfrac{1}{4} I\sqrt{2}\sqrt{5 + \sqrt{5}}
\end{gathered}$$

In this case, we might prefer a decimal approximation to the roots rather than
the exactness that Maple provides. One way to achieve this is to use the fsolve
command:

> fsolve(r^4 + r^3 + r^2 + r + 1 = 0, r, complex);

which generates the result

-0.80902 - 0.58779I, -0.80902 + 0.58779I, 0.30902 - 0.95106I,
0.30902 + 0.95106I


Note that without the option “complex” in the fsolve command, the 
command will not generate any output. This is because the default setting for 
fsolve is to numerically approximate all of the real roots of the polynomial 
equation and to ignore complex ones. For polynomial equations of degree 5 
or more, the fsolve command is the appropriate tool to use to determine
accurate approximations of the equation’s solutions. 
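For readers without access to Maple, the relationship between the exact and decimal forms of the roots of $r^4 + r^3 + r^2 + r + 1 = 0$ can be checked with a short Python sketch. It is not a replacement for solve or fsolve: it simply evaluates the closed-form radicals for the four roots (which are the non-real fifth roots of unity), prints them to five decimal places, and confirms that each one makes the polynomial vanish. The names `p` and `exact_roots` are our own.

```python
import math

def p(r):
    """The polynomial r^4 + r^3 + r^2 + r + 1."""
    return r**4 + r**3 + r**2 + r + 1

s5 = math.sqrt(5)
exact_roots = [
    complex(-0.25 + s5 / 4, math.sqrt(2) * math.sqrt(5 + s5) / 4),
    complex(-0.25 - s5 / 4, math.sqrt(2) * math.sqrt(5 - s5) / 4),
    complex(-0.25 - s5 / 4, -math.sqrt(2) * math.sqrt(5 - s5) / 4),
    complex(-0.25 + s5 / 4, -math.sqrt(2) * math.sqrt(5 + s5) / 4),
]

for r in exact_roots:
    # Real part, imaginary part, and the size of p(r), which should be near zero.
    print(f"{r.real:.5f} {r.imag:+.5f}I   |p(r)| = {abs(p(r)):.1e}")
```

The printed decimals have real parts near 0.30902 and -0.80902 and imaginary parts near 0.95106 and 0.58779 in absolute value, the same numbers that the numerical solver reports.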


Exercises 4.6 In exercises 1-12, use the characteristic equation to determine
the general solution to the given higher order linear homogeneous DE.

1. $y''' - 2y'' - y' + 2y = 0$
2. $y''' - 2y'' - 3y' = 0$
3. $4y''' - 13y' - 6y = 0$
4. $y^{(4)} - 13y'' + 36y = 0$
5. $y''' + 3y'' + 3y' + y = 0$
6. $y^{(4)} - y''' - 7y'' + y' + 6y = 0$
7. $y''' - y'' + 4y' - 4y = 0$
8. $y^{(4)} - y = 0$
9. $y^{(5)} - 2y^{(4)} - y' + 2y = 0$
10. $y^{(6)} + 9y^{(4)} + 24y'' + 16y = 0$
11. $y^{(4)} + 4y''' + 6y'' + 4y' + y = 0$
12. $y^{(4)} + 3y''' + y'' - 5y' = 0$


In exercises 13-22, solve the given IVP.

13. $y''' - 4y' = 0$, $y(0) = 1$, $y'(0) = 0$, $y''(0) = 2$
14. $y''' - 3y'' + 2y' = 0$, $y(0) = 0$, $y'(0) = 2$, $y''(0) = 0$
15. $y''' - 6y'' + 11y' - 6y = 0$, $y(0) = 0$, $y'(0) = 2$, $y''(0) = 0$
16. $y^{(4)} - 2y''' - y'' + 2y' = 0$, $y(0) = 2$, $y'(0) = 0$, $y''(0) = 10$, $y'''(0) = 0$
17. $y''' + y'' + 4y' + 4y = 0$, $y(0) = 0$, $y'(0) = 10$, $y''(0) = 0$
18. $y^{(4)} + 5y'' + 4y = 0$, $y(0) = 4$, $y'(0) = 0$, $y''(0) = 10$, $y'''(0) = 0$
19. $y''' = 0$, $y(0) = 2$, $y'(0) = 0$, $y''(0) = 2$
20. $y^{(4)} - 16y = 0$, $y(0) = 4$, $y'(0) = 0$, $y''(0) = 0$, $y'''(0) = 0$
21. $y''' - 3y'' + 3y' - y = 0$, $y(0) = 1$, $y'(0) = 2$, $y''(0) = 1$
22. $y^{(5)} + y''' = 0$, $y(0) = 1$, $y'(0) = 0$, $y''(0) = 2$, $y'''(0) = 0$, $y^{(4)}(0) = 4$


In exercises 23-28, construct a homogeneous linear differential equation of the 
least possible order that has the given function(s) as solutions. 


23; H= eG =e 

24, yy =P e** 

25. 71 =t, yo = cos3t, y3 =e * 
26. y, = te*' sint 


27.y, =e *? 


cost, y2 = sin5t 
28. y; = sint, y2(t) = tsint 
29. Find the general solution to $y^{(4)} + 2y'' + y = \cos t$.

30. Find a particular solution to $y^{(4)} + 2y'' + y = \sin t + 2\cos t$. How is your
answer similar to the result in exercise 29?


In exercises 31-42, use undetermined coefficients to determine the general
solution to the stated nonhomogeneous equation. Note that each of the
corresponding homogeneous equations has been solved in exercises 1-12.

31. $y''' - 2y'' - y' + 2y = 2$
32. $y''' - 2y'' - 3y' = 2e^{t}$
33. $4y''' - 13y' - 6y = \cos t$
34. $y^{(4)} - 13y'' + 36y = t$
35. $y''' + 3y'' + 3y' + y = \sin t$
36. $y^{(4)} - y''' - 7y'' + y' + 6y = t^2 + 3$
37. $y''' - y'' + 4y' - 4y =$ e"
38. $y^{(4)} - y = 3t$
39. $y^{(5)} - 2y^{(4)} - y' + 2y = 7$
40. $y^{(6)} + 9y^{(4)} + 24y'' + 16y = t^2$
41. $y^{(4)} + 4y''' + 6y'' + 4y' + y = t + \cos t$
42. $y^{(4)} + 3y''' + y'' - 5y' = 2t - \sin t + e^{t}$


4.7 For further study 
4.7.1 Damped motion 
Consider the general form of the spring-mass equation

$$my'' + cy' + ky = 0 \tag{4.7.1}$$

where $c \neq 0$ so that viscous damping is present. In what follows, we explore how
the values of the constants m, c, and k affect the behavior of the solution y. Note
that in this context, m, c, and k are always positive.


(a) Show that the roots of the characteristic polynomial of (4.7.1) are

$$r = \frac{-c \pm \sqrt{c^2 - 4mk}}{2m}$$
(b) We examine the three possible cases for the roots of the characteristic 
polynomial: 


(i) Suppose that $c^2 - 4km > 0$. Explain why $\sqrt{c^2 - 4mk} < c$ and thus why
both roots of the characteristic equation must be negative. State the
general solution to the equation (4.7.1) in terms of the constants c, m,
and k.

(ii) Suppose that $c^2 - 4km = 0$. Discuss the number of real roots of the
characteristic polynomial and state the general solution to the
equation (4.7.1) in terms of the constants c and m.

(iii) Suppose that $c^2 - 4km < 0$. Explain why both roots of the
characteristic polynomial are complex. Using $\Omega = \sqrt{4mk - c^2}/(2m)$,
state the general solution to the equation (4.7.1) in terms of the
constants c, m, and $\Omega$.


(c) The respective cases (i), (ii), and (iii) in (b) are typically called 
overdamping, critical damping, and underdamping. How is the case of 
underdamping significantly different from overdamping and critical 
damping? Explain both in terms of the algebraic form of the solution as 
well as in terms of the solution’s expected graph. 


(d) A 4-kg mass is suspended from a spring with constant k = 25, and a
dashpot with various levels of damping viscosity is present. The mass is
displaced 0.5 m from its equilibrium and released. Determine the
displacement y(t) of the mass if

(i) c = 15, (ii) c = 20, (iii) c = 25, and (iv) c = 30


In each case, state whether the system is overdamped, critically damped, or 
underdamped, and sketch the solution curve. 


(e) The case of underdamping is the most interesting of the three cases, for it is
here that multiple oscillations through equilibrium occur. In (b) (iii), you
should have shown that the general solution may be expressed in the form

$$y = e^{-\frac{c}{2m}t}\left(c_1\cos\Omega t + c_2\sin\Omega t\right)$$

Show that y may be alternatively expressed in the form

$$y = Ae^{-\frac{c}{2m}t}\cos(\Omega t - \theta) \tag{4.7.2}$$

where $A = \sqrt{c_1^2 + c_2^2}$ and $\tan\theta = c_2/c_1$. (Hint: Set $A\cos(\Omega t - \theta) = c_1\cos\Omega t + c_2\sin\Omega t$
and equate like coefficients after using the
trigonometric identity $\cos(\alpha - \beta) = \cos\alpha\cos\beta + \sin\alpha\sin\beta$.)


(f) In the underdamped case, we are interested in how fast the amplitude of
the oscillations decays to zero. In what follows, we show how the ratio of
consecutive local maxima (or minima) of y(t) depends only on the
constants c, m, and $\Omega$.

(i) Using $y = Ae^{-\frac{c}{2m}t}\cos(\Omega t - \theta)$ from (e), determine y' and show that
y' = 0 if and only if

$$\tan(\Omega t - \theta) = -\frac{c}{2m\Omega} \tag{4.7.3}$$

(ii) If the solutions of (4.7.3) are denoted by $t_n$, then show that

$$t_n = \frac{\theta}{\Omega} + \frac{1}{\Omega}\arctan\left(-\frac{c}{2m\Omega}\right) + \frac{n\pi}{\Omega} \tag{4.7.4}$$

Explain why we expect $y(t_n)$ and $y(t_{n+1})$ to be a local maximum and
minimum (or local minimum and maximum), respectively, of y(t),
and hence why $y(t_n)$ and $y(t_{n+2})$ will be consecutive maxima or
consecutive minima.

(iii) Let $y_n = y(t_n)$ and $y_{n+2} = y(t_{n+2})$. Using (4.7.2), evaluate $y(t_n)$ and
$y(t_{n+2})$ and verify that

$$\frac{y_n}{y_{n+2}} = \frac{\cos(\Omega t_n - \theta)}{\cos(\Omega t_{n+2} - \theta)}\, e^{-\frac{c}{2m}(t_n - t_{n+2})} \tag{4.7.5}$$

(iv) Show that (4.7.3) implies

$$(t_n - t_{n+2})\Omega = -2\pi$$

and thus

$$\frac{y_n}{y_{n+2}} = \frac{\cos(\Omega t_n - \theta)}{\cos(\Omega t_{n+2} - \theta)}\, e^{\pi c/(m\Omega)} \tag{4.7.6}$$

(v) Show that

$$\Omega t_n - \theta = \Omega t_{n+2} - \theta - 2\pi$$

so that

$$\cos(\Omega t_n - \theta) = \cos(\Omega t_{n+2} - \theta)$$

Use this last result to prove that

$$\frac{y_n}{y_{n+2}} = e^{\pi c/(m\Omega)} \tag{4.7.7}$$


(g) The logarithm of (4.7.7),

$$D = \ln\frac{y_n}{y_{n+2}} = \ln e^{\pi c/(m\Omega)} = \frac{\pi c}{m\Omega} \tag{4.7.8}$$

is called the logarithmic decrement. Note that this quantity is independent
of t as well as the initial conditions present in the underdamped case for
the DE (4.7.1), and that the value of the logarithmic decrement tells us
how rapidly consecutive oscillations diminish in the underdamped case.
For each of the following underdamped spring-mass systems, determine
the solution function y(t) and compute the logarithmic decrement.
Explain how the value of the logarithmic decrement tells you whether
oscillations will die out slowly or rapidly. Using a computer algebra system
to execute the routine calculations is particularly appropriate here. In each
case, assume the mass is displaced 1 m and released.

(i) m = 4, c = 19, k = 25
(ii) m = 4, c = 10, k = 25
(iii) m = 4, c = 1, k = 25
(iv) m = 4, c = 0.1, k = 25
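The logarithmic decrement is easy to experiment with numerically. The sketch below is one possible setup, using illustrative values m = 1, c = 2, k = 5 that are not taken from the exercises. It computes $\Omega$ and $D = \pi c/(m\Omega)$, then compares D with the logarithm of the ratio of two consecutive maxima of the solution satisfying y(0) = 1, y'(0) = 0. For that initial condition the maxima occur at t = 0 and $t = 2\pi/\Omega$, which is a fact we use without proof here. The function names are hypothetical.

```python
import math

def log_decrement(m, c, k):
    """Return (Omega, D) for an underdamped system m y'' + c y' + k y = 0."""
    disc = 4 * m * k - c * c
    if disc <= 0:
        raise ValueError("system is not underdamped")
    omega = math.sqrt(disc) / (2 * m)
    return omega, math.pi * c / (m * omega)

def y(t, m, c, k, y0=1.0):
    """Solution with y(0) = y0 and y'(0) = 0 (mass displaced and released)."""
    omega, _ = log_decrement(m, c, k)
    a = c / (2 * m)
    return y0 * math.exp(-a * t) * (math.cos(omega * t) + (a / omega) * math.sin(omega * t))

m, c, k = 1.0, 2.0, 5.0          # illustrative values only
omega, D = log_decrement(m, c, k)
period = 2 * math.pi / omega      # time between consecutive maxima
ratio = y(0.0, m, c, k) / y(period, m, c, k)
print(omega, D, math.log(ratio))  # the last two values should agree
```

A large value of D means each maximum is much smaller than the one before it, so the oscillations die out quickly; a value of D near zero means the oscillations persist for many cycles.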
4.7.2 Forced oscillations with damping 
Consider the general form of the forced spring-mass equation

$$my'' + cy' + ky = f(t) \tag{4.7.9}$$

where c > 0 so that viscous damping is present. Again, we remark that in this
context m and k are always positive.


(a) Show that if

$$\Omega = \frac{\sqrt{c^2 - 4km}}{2m}$$

then the complementary solution of (4.7.9) is

$$y_h(t) = e^{-\frac{c}{2m}t}\left(c_1 e^{\Omega t} + c_2 e^{-\Omega t}\right) \tag{4.7.10}$$

(b) Explain why

$$\lim_{t \to \infty} y_h(t) = 0$$

Recall that we call $y_h(t)$ the transient solution. What does this tell us about
the role played by the particular solution $y_p(t)$ in the general solution
$y = y_h + y_p$ as $t \to \infty$?


(c) We now consider the effects of the periodic forcing function
$f(t) = F_0\cos\omega t$. With this function, we have seen that resonance is only
possible when no damping is present; here, we wish to explore the impact
of the parameters in f(t) on the steady-state solution $y_p$ to (4.7.9).

(i) Use the method of undetermined coefficients to show that with
$f(t) = F_0\cos\omega t$, the particular solution $y_p$ to (4.7.9) is

$$y_p = \frac{F_0(k - m\omega^2)}{(k - m\omega^2)^2 + \omega^2 c^2}\left(\cos\omega t + \frac{c\omega}{k - m\omega^2}\sin\omega t\right) \tag{4.7.11}$$

(ii) As in our study of undamped spring-mass systems and resonance,
we let $\omega_0 = \sqrt{k/m}$. Show that $y_p(t)$ may be equivalently expressed
in the form

$$y_p = \frac{F_0}{\sqrt{m^2(\omega_0^2 - \omega^2)^2 + \omega^2 c^2}}\cos(\omega t - \theta) \tag{4.7.12}$$

Compare the result to (4.7.2).

(iii) Observe that the amplitude of the oscillation of $y_p$ in (4.7.12) is

$$A(\omega) = \frac{F_0}{\sqrt{m^2(\omega_0^2 - \omega^2)^2 + \omega^2 c^2}} \tag{4.7.13}$$

and that $\omega_0$, m, and c are fixed constants determined by the given
spring-mass system. We now examine how the size of these
oscillations depends on $\omega$.

First, compute $dA/d\omega$. Then, set $dA/d\omega = 0$ to show that the maximum
amplitude occurs when

$$\omega^2 = \omega_0^2 - \frac{c^2}{2m^2} \tag{4.7.14}$$

(iv) Explain why if c satisfies $c^2 > 2m^2\omega_0^2$, then there is no value of $\omega$ that
produces a maximum amplitude of oscillation.
In addition, note that when a maximum amplitude exists (i.e.,
provided $c^2 < 2m^2\omega_0^2$), its value is given by $A(\omega)$ where $\omega$
satisfies (4.7.14). Use this condition to compute $A(\omega)$ and show that

$$A_{\max} = \frac{2mF_0}{c\sqrt{4m^2\omega_0^2 - c^2}} \tag{4.7.15}$$

(v) Consider a particular spring-mass system for which m = 1 and k = 4
where we consider various damping constants c. In addition, assume
we apply the forcing function $f(t) = \cos\omega t$, so that $F_0 = 1$. Recall that
$\omega_0 = \sqrt{k/m}$, so $\omega_0 = 2$.

For each of the c-values c = 0.1, 1, 2, 3, 4, 5, 6, plot the function

$$A(\omega) = \frac{F_0}{\sqrt{m^2(\omega_0^2 - \omega^2)^2 + \omega^2 c^2}}$$

on the interval $\omega = 0 \ldots 10$. When a maximum oscillation exists, where
does it occur? How is the size of the maximum oscillation correlated
with c and $\omega$? What should we ensure about the relationship between
$\omega$ and $\omega_0$ if we want to avoid large amplitude oscillations?
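Before plotting, it can be reassuring to confirm numerically that (4.7.14) and (4.7.15) are consistent with the amplitude formula (4.7.13). The sketch below does this for illustrative values m = 2, c = 2, k = 18, $F_0 = 1$, which are our own choice rather than the values in the exercise. The function names `amplitude` and `peak` are hypothetical.

```python
import math

def amplitude(omega, m, c, k, f0=1.0):
    """Steady-state amplitude A(omega) from (4.7.13), with omega_0^2 = k/m."""
    w0_sq = k / m
    return f0 / math.sqrt(m**2 * (w0_sq - omega**2) ** 2 + omega**2 * c**2)

def peak(m, c, k, f0=1.0):
    """Return (omega_peak, A_max) from (4.7.14) and (4.7.15).

    Returns None when omega_0^2 - c^2/(2 m^2) is not positive, in which case
    there is no positive frequency at which the amplitude is maximized.
    """
    w0_sq = k / m
    w_sq = w0_sq - c**2 / (2 * m**2)
    if w_sq <= 0:
        return None
    a_max = 2 * m * f0 / (c * math.sqrt(4 * m**2 * w0_sq - c**2))
    return math.sqrt(w_sq), a_max

m, c, k = 2.0, 2.0, 18.0           # illustrative values only
w_peak, a_max = peak(m, c, k)
print(w_peak, a_max, amplitude(w_peak, m, c, k))   # last two should agree
print(amplitude(w_peak - 0.1, m, c, k), amplitude(w_peak + 0.1, m, c, k))
print(peak(m, 10.0, k))            # heavy damping: no interior maximum
```

Note that the peak frequency is slightly below $\omega_0$ when damping is present, and that the peak disappears altogether once c is large enough.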


(d) Complete the following exercises which examine the magnitude of 
oscillations in damped, driven spring-mass systems. 


(i) A forcing function $f(t) = 10\sin 2t$ is imposed on a spring-mass system
for which m = 2 kg and k = 8 N/m. Determine the damping constant
necessary to limit the amplitude of the motion to a maximum of 2 m.

(ii) A forcing function $f(t) = 50\cos\omega t$ is imposed on a spring-mass
system for which m = 4 kg, k = 100 N/m, and c = 2 kg/s. Calculate the
amplitude of the resulting motion for $\omega = 4$, $\omega = 4.5$, $\omega = 5$, and
$\omega = 6$.

(iii) Determine the input frequency w that gives the maximum amplitude 
for the spring-mass system in (ii) above. For this frequency, what is the 
maximum amplitude? 


4.7.3 The Cauchy-Euler equation 


The vast majority of our efforts with higher order DEs have involved
linear equations with constant coefficients. The Cauchy-Euler equation is an
important example of a linear, second-order DE whose coefficients are not
constant. In particular, the Cauchy-Euler equation is a differential equation of
the form

$$t^2 y'' + pty' + qy = 0 \tag{4.7.16}$$

where p and q are real constants and t > 0.


(a) Explain why it is reasonable to guess that $y(t) = t^{\lambda}$ is a solution
to (4.7.16). Show by direct substitution in (4.7.16) that the guess $y(t) = t^{\lambda}$
requires $\lambda$ to be a solution to the characteristic equation

$$\lambda^2 + (p - 1)\lambda + q = 0 \tag{4.7.17}$$

(b) In the case where (4.7.17) has two distinct real roots $\lambda_1$ and $\lambda_2$, the
general solution to the Cauchy-Euler equation is

$$y = c_1 t^{\lambda_1} + c_2 t^{\lambda_2}$$




Solve each of the following Cauchy-Euler initial-value problems:
(i) $t^2 y'' - 5ty' + 8y = 0$, $y(1) = 1$, $y'(1) = 0$
(ii) $t^2 y'' + 9ty' + 12y = 0$, $y(1) = 1$, $y'(1) = 0$

(c) When (4.7.17) has a repeated real root $\lambda_1 = \lambda_2 = \lambda$, then we have
only determined one linearly independent solution ($y_1 = t^{\lambda}$) of the
Cauchy-Euler equation. Here we determine a second linearly independent
solution.

(i) Assuming that $\lambda$ is a repeated root of (4.7.17), show that $1 - p = 2\lambda$.
(ii) Letting v(t) be an unknown function, consider the guess $y_2 = v \cdot t^{\lambda}$. By
direct substitution in the Cauchy-Euler equation, show that v must
satisfy the equation

$$t^{\lambda}\left[t^2 v'' + (2\lambda + p)tv' + (\lambda^2 + (p - 1)\lambda + q)v\right] = 0 \tag{4.7.18}$$

(iii) Use your work in (i) and (ii), as well as the fact that $\lambda$ satisfies the
equation

$$\lambda^2 + (p - 1)\lambda + q = 0$$

to show that $y_2 = v \cdot t^{\lambda}$ is a solution of the Cauchy-Euler equation in
the case of a repeated root provided that

$$tv'' + v' = 0 \tag{4.7.19}$$

(iv) Show that $v(t) = \ln t$ is a solution of (4.7.19) and hence state the
general solution of the Cauchy-Euler equation in the case where the
characteristic equation has a single real repeated root.


(d) Solve each of the following Cauchy-Euler initial-value problems:

(i) $t^2 y'' + 7ty' + 9y = 0$, $y(1) = 1$, $y'(1) = 0$
(ii) $t^2 y'' - 9ty' + 25y = 0$, $y(1) = 1$, $y'(1) = 0$


(e) When (4.7.17) has complex roots, say $\lambda_1 = a + bi$ and $\lambda_2 = a - bi$, then we
proceed with a corresponding complex solution to the Cauchy-Euler
equation and verify that its real and imaginary parts are themselves real,
linearly independent solutions to the equation. In particular, with
$\lambda = a + bi$, observe that

$$z(t) = t^{\lambda} = t^{a+bi} = t^{a}t^{bi}$$

By writing

$$t^{bi} = e^{\ln(t^{bi})} = e^{bi\ln t}$$

and applying Euler's formula, show that

$$z(t) = t^{a}[\cos(b\ln t) + i\sin(b\ln t)] \tag{4.7.20}$$

In addition, show by direct substitution that $y_1(t) = t^{a}\cos(b\ln t)$ is a
solution to the Cauchy-Euler equation when a + bi is a root of the
characteristic polynomial. Likewise, show that $y_2(t) = t^{a}\sin(b\ln t)$ is a
solution.

Hence, state the general solution to the Cauchy-Euler equation in the case
where the characteristic polynomial has complex roots $\lambda = a \pm bi$.


(f) Solve each of the following Cauchy-Euler initial-value problems:

(i) $t^2 y'' + 3ty' + 5y = 0$, $y(1) = 1$, $y'(1) = 0$
(ii) $t^2 y'' - 3ty' + 13y = 0$, $y(1) = 1$, $y'(1) = 0$
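The guess $y = t^{\lambda}$ can be tested numerically for any Cauchy-Euler equation. The following sketch finds the roots of $\lambda^2 + (p - 1)\lambda + q = 0$ with complex arithmetic and then evaluates $t^2 y'' + pty' + qy$ for $y = t^{\lambda}$ at a few positive values of t. The two equations used, $t^2 y'' + 2ty' - 6y = 0$ and $t^2 y'' + ty' + 4y = 0$, are hypothetical examples chosen for this illustration and are not among the exercises. The function names are our own.

```python
import cmath

def cauchy_euler_roots(p, q):
    """Roots of lambda^2 + (p - 1) lambda + q = 0, returned as complex numbers."""
    b = p - 1
    disc = cmath.sqrt(b * b - 4 * q)
    return (-b + disc) / 2, (-b - disc) / 2

def residual(lam, p, q, t):
    """Value of t^2 y'' + p t y' + q y for y = t^lam, with t > 0."""
    y = t**lam
    dy = lam * t ** (lam - 1)
    d2y = lam * (lam - 1) * t ** (lam - 2)
    return t**2 * d2y + p * t * dy + q * y

for p, q in ((2, -6), (1, 4)):   # hypothetical examples
    for lam in cauchy_euler_roots(p, q):
        worst = max(abs(residual(lam, p, q, t)) for t in (0.5, 2.0, 3.0))
        print(p, q, lam, worst)  # worst residual should be near zero
```

For the second equation the roots are purely imaginary, and the complex function $t^{\lambda}$ still satisfies the equation; its real and imaginary parts are the real solutions $\cos(2\ln t)$ and $\sin(2\ln t)$ described in (e).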


4.7.4 Companion systems and companion matrices 


Given a second-order linear differential equation with constant coefficients
such as

$$y'' + by' + cy = 0 \tag{4.7.21}$$

we know that through the substitution $x_1 = y$, $x_2 = y'$ we can convert (4.7.21)
to the system of first-order equations given by

$$\begin{aligned}
x_1' &= x_2 \\
x_2' &= -cx_1 - bx_2
\end{aligned} \tag{4.7.22}$$

The system (4.7.22) is called the companion system of (4.7.21). In what follows,
we explore the connections between the original equation and its companion
system.


(a) Consider the homogeneous linear second-order DE

$$y'' + 3y' + 2y = 0 \tag{4.7.23}$$

Using the guess $y = e^{rt}$, find the characteristic equation of (4.7.23) and the
values of r that make $y = e^{rt}$ a solution of the given DE.


(b) Convert the DE (4.7.23) into a system of first-order equations in the form 
x’ = Ax. In addition, determine the eigenvalues of the matrix A. 


(c) What do you observe about the roots of the characteristic equation in (a) 
and the eigenvalues of the matrix in (b)? Why is this result not surprising? 


(d) Find the general solution of the second-order equation (4.7.23) using 
standard methods from chapter 4. Find the general solution of the 
first-order system you found in (b) using standard methods from 
chapter 3. Explain how your two results agree. 


(e) Now consider the general equation (4.7.21) where b and c are arbitrary
constants and its corresponding companion system.

(i) Show that the roots of the characteristic equation are

$$r = \frac{-b \pm \sqrt{b^2 - 4c}}{2}$$

and that the eigenvalues of the coefficient matrix of the companion
system are

$$\lambda = \frac{-b \pm \sqrt{b^2 - 4c}}{2}$$

(ii) Assuming that $b^2 - 4c > 0$ so that the values of r in (i) are real and
distinct, state the general solution of (4.7.21).
(iii) Show that the eigenvectors of the matrix of the companion system that
correspond to $\lambda_1$ and $\lambda_2$ are given by

$$\mathbf{v}_1 = \begin{bmatrix} 1 \\ \lambda_1 \end{bmatrix} \quad\text{and}\quad \mathbf{v}_2 = \begin{bmatrix} 1 \\ \lambda_2 \end{bmatrix}$$

where $\lambda_1 = (-b + \sqrt{b^2 - 4c})/2$ and $\lambda_2 = (-b - \sqrt{b^2 - 4c})/2$. State
the general solution to the companion system.

(iv) Compare your result from (ii) to the result for x2 in (iii). Do your
solutions agree?


(f) Our work above shows that for any second-order differential equation,
there exists a companion system of two first-order equations whose vector
solution contains the solution of the second-order equation.

For the third-order equation

$$y''' + 2y'' - y' - 2y = 0$$

find the solution of the system directly by using standard methods from
chapter 4. Then, find the general solution of the first-order companion
system constructed from the substitution $x_1 = y$, $x_2 = y'$, $x_3 = y''$ using
standard methods from chapter 3. Compare your results.


(g) In both the direct solution of higher order linear differential equations and
in the solution of systems of linear first-order equations, the solution
methods require us to find roots of polynomials. Our work above enables
us to see the fact that any polynomial has an associated matrix, a so-called
companion matrix, whose eigenvalues are the same as the zeros of the
polynomial. In general, given a polynomial function

$$p(t) = t^n + a_{n-1}t^{n-1} + a_{n-2}t^{n-2} + \cdots + a_1 t + a_0$$

the companion matrix of p(t) is given by

$$C = \begin{bmatrix}
0 & 1 & 0 & \cdots & 0 & 0 \\
0 & 0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 0 & 1 \\
-a_0 & -a_1 & -a_2 & \cdots & -a_{n-2} & -a_{n-1}
\end{bmatrix}$$

That is, C is an n x n matrix whose first n - 1 rows are all zero except for
the entry just above the diagonal, whose value is 1. The final row consists
of the opposites of the coefficients of the constant, linear, etc., terms of the
polynomial p.

It can be proved that, in general, the eigenvalues of C are the same as the
zeros of p(t). We verify this fact through a few examples.


(i) For the polynomial $p(t) = t^2 + 3t + 2$, determine the companion
matrix C. Compute the eigenvalues of C directly and compare the
result to the zeros of p(t).

(ii) For the polynomial $p(t) = t^3 + 3t^2 + 3t + 1$, determine the
companion matrix C. Compute the eigenvalues of C directly and
compare the result to the zeros of p(t).

(iii) For the polynomial p(t) = t* — 1, determine the companion matrix C. 
Compute the eigenvalues of C directly and compare the result to the 
zeros of p(t). 


(h) For the nth-order linear homogeneous equation

$$y^{(n)} + a_{n-1}y^{(n-1)} + \cdots + a_1 y' + a_0 y = 0 \tag{4.7.25}$$

show that the coefficient matrix of the corresponding companion system is
in fact the companion matrix of the characteristic polynomial of (4.7.25).
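The link between zeros of a polynomial and eigenvalues of its companion matrix can be seen concretely: if $\lambda$ is a zero of p, then the vector $(1, \lambda, \lambda^2, \ldots, \lambda^{n-1})$ satisfies $C\mathbf{v} = \lambda\mathbf{v}$. The sketch below builds the companion matrix from the coefficient list and checks this relation for a hypothetical cubic, $p(t) = t^3 - 6t^2 + 11t - 6 = (t - 1)(t - 2)(t - 3)$, which is our own example. The function names are illustrative.

```python
def companion(coeffs):
    """Companion matrix of t^n + a_{n-1} t^{n-1} + ... + a_0.

    coeffs lists [a_0, a_1, ..., a_{n-1}] (constant term first).
    """
    n = len(coeffs)
    # First n - 1 rows: a single 1 just above the diagonal.
    rows = [[1 if j == i + 1 else 0 for j in range(n)] for i in range(n - 1)]
    # Final row: the opposites of the coefficients.
    rows.append([-a for a in coeffs])
    return rows

def matvec(m, v):
    """Matrix-vector product for lists of lists."""
    return [sum(a * x for a, x in zip(row, v)) for row in m]

# Hypothetical example: p(t) = t^3 - 6t^2 + 11t - 6, with zeros 1, 2, 3.
C = companion([-6, 11, -6])
for lam in (1, 2, 3):
    v = [lam**k for k in range(3)]
    print(lam, matvec(C, v), [lam * x for x in v])  # the two lists should match
```

The first n - 1 rows of C simply shift the powers of $\lambda$ up by one, and the last row reproduces $\lambda^n$ exactly when $p(\lambda) = 0$; this is the same mechanism that turns an nth-order equation into its companion system.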




Laplace transforms 


5.1 Motivating problems 


In this chapter, we again consider solving nonhomogeneous linear differential
equations such as

$$y'' + a_1 y' + a_0 y = f(t)$$


but in contexts where the forcing function is different from those we have 
previously encountered. While we have developed the methods of undetermined 
coefficients and variation of parameters to approach this problem, there are 
several reasons to consider a different means of solution. Perhaps, most 
prominent is that in every example to date, we have assumed that the function 
f(t) is continuous. Indeed, it has also typically been the case that f(t) is a 
standard function, one belonging to the library of basic functions like sin2t and 
Int that we encounter in calculus. In many applications, however, it is possible 
for f(t) to be piecewise defined, discontinuous, or worse. We consider two 
examples that demonstrate these possibilities. 

Electrical circuits with a voltage source provide a common situation where
the forcing function f(t) is not continuous. If we flip a switch to turn the voltage
on, then the forcing function is actually a step function that leaps from zero to a
constant value. Recall that the charge Q(t) in an RLC circuit is modeled by the
second-order equation

$$LQ'' + RQ' + \frac{1}{C}Q = E(t) \tag{5.1.1}$$

where E(t) is an external voltage source. Suppose that we are given an RLC
circuit with an initial charge Q(0) and initial current Q'(0), and that the voltage
E(t) = 1000 is turned on at t = 4. The voltage function E(t) is, therefore, defined
piecewise by the formula

$$E(t) = \begin{cases} 0, & \text{if } 0 \le t < 4 \\ 1000, & \text{if } t \ge 4 \end{cases}$$

Let us further assume that L = 20 H, R = 40 $\Omega$, $C = 10^{-2}$ F, and that
Q(0) = 25 and Q'(0) = 0. From the given information and (5.1.1), we know
that Q(t) is modeled by the initial-value problem

$$20Q'' + 40Q' + 100Q = E(t), \quad Q(0) = 25, \quad Q'(0) = 0 \tag{5.1.2}$$


We have not yet encountered means to deal with a step function as the forcing 
function in an initial-value problem. In section 5.4, we will discuss step functions 
in detail, learning how they may be used to turn other functions on and off; in 
addition, we will show how the Laplace transform provides an ideal tool for 
dealing with piecewise-defined functions in initial-value problems. With these 
tools, we will be able to determine the solution Q(t) for (5.1.2) whose graph 
is shown in figure 5.1. Observe that we see the expected damped oscillation in 
Q(t) up until time t = 4 when the forcing function E(t) is turned on, at which 
point we see the solution driven vertically away from zero so that as t increases,
$Q(t) \to 10$. That Q(t) approaches 10 should not surprise us since Q(t) = 10 is
a constant solution to the equation 


20Q” + 40Q’ + 100Q = 1000 


In fact, Q(t) = 10 is a stable equilibrium solution of the equation. 
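Although the Laplace transform will give an exact formula, the qualitative behavior described here can already be observed numerically. The sketch below rewrites (5.1.2) as a first-order system in Q and Q' and integrates it with the classical fourth-order Runge-Kutta method, a standard numerical scheme that is not developed in this section. The step size h = 0.001 and the function names are our own assumptions, and we take the switch to be on for $t \ge 4$.

```python
def E(t):
    """Step voltage: off before t = 4, equal to 1000 from t = 4 on."""
    return 1000.0 if t >= 4.0 else 0.0

def rhs(t, q, i):
    """First-order form of 20 Q'' + 40 Q' + 100 Q = E(t): returns (Q', Q'')."""
    return i, (E(t) - 40.0 * i - 100.0 * q) / 20.0

def simulate(t_end, h=0.001, q=25.0, i=0.0):
    """Integrate from t = 0 with classical RK4 and return Q(t_end)."""
    steps = round(t_end / h)
    for n in range(steps):
        t = n * h
        k1q, k1i = rhs(t, q, i)
        k2q, k2i = rhs(t + h / 2, q + h / 2 * k1q, i + h / 2 * k1i)
        k3q, k3i = rhs(t + h / 2, q + h / 2 * k2q, i + h / 2 * k2i)
        k4q, k4i = rhs(t + h, q + h * k3q, i + h * k3i)
        q += h / 6 * (k1q + 2 * k2q + 2 * k3q + k4q)
        i += h / 6 * (k1i + 2 * k2i + 2 * k3i + k4i)
    return q

for t_end in (2.0, 4.0, 6.0, 20.0):
    print(t_end, simulate(t_end))
```

The printed values show the charge decaying toward zero before the switch is thrown and settling near the equilibrium value 10 afterward.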

In addition to functions that get turned on or off at a certain time, another 
important forcing function to consider is a so-called impulse function. These 
functions are ones where a force is imparted over an extremely short time 
interval such as a hammer striking a mass. In section 5.4, we introduce the Dirac 
delta function, $\delta(t)$, study its properties, and see how it may be used in settings
such as the following. 


Figure 5.1 The solution Q(t) to (5.1.2).


Figure 5.2 The solution curve y(t) to (5.1.3).


Suppose that a mass of 1 kg is attached to a spring with constant k = 4 and 
the system’s damping constant is c = 2. In addition, assume that the mass is 
initially displaced 0.5 m from equilibrium and released. At time t = 4, the mass 
is struck with a hammer imparting a unit impulse in the positive direction. The 
combination of all of these conditions leads to the initial-value problem

$$y'' + 2y' + 4y = \delta(t - 4), \quad y(0) = 0.5, \quad y'(0) = 0 \tag{5.1.3}$$

where the function $\delta(t - 4)$ represents the hammer imparting the unit force of
impulse.

Just as with piecewise-defined functions, we will learn that the Laplace 
transform provides an ideal tool for dealing with impulses. Once we develop the 
appropriate theory, we will be able to solve initial-value problems such as (5.1.3) 
and see that the solution behaves as shown in figure 5.2. In the solution, we see 
the noticeable impact of the impulse as the problem appears to restart, almost 
as if new initial conditions have been given at time t = 4. 

In addition to being able to address discontinuous and impulse forcing 
functions, the Laplace transform is a powerful tool because it handles all 
allowable forcing functions in the same manner. Moreover, in each case it 
proceeds directly to the solution of initial-value problems without first finding 
the general solution to the differential equation. These ideas and more will be 
studied in subsequent sections. 


5.2 Laplace transforms: getting started 


The motivating idea behind the Laplace transform is natural: to solve a 
differential equation, our desire is to integrate. For the simplest examples, such 
as y’ = y, we know that we can separate variables and integrate in order to 
determine y. However, if we approach the problem


$$y' + a_0 y = f(t) \tag{5.2.1}$$

by attempting to integrate both sides from 0 to s with respect to t in order to
eliminate y', doing so leads to the equation


$$\int_0^s y'(t)\,dt + a_0 \int_0^s y(t)\,dt = \int_0^s f(t)\,dt \tag{5.2.2}$$


While $\int_0^s y'(t)\,dt = y(s) - y(0)$ eliminates the derivative y' from the equation,
and $\int_0^s f(t)\,dt$ can often be computed for a given f, in (5.2.2) we are left with
the expression $\int_0^s y(t)\,dt$, where y is an unknown function. Essentially, this step
of integrating has replaced the derivative of the unknown function y with its
integral in the equation we are endeavoring to solve. This leaves us no closer to
finding the solution function y(t).

Rather than simply trying to integrate, the Laplace transform uses a 
modified approach in which every function in (5.2.1) is multiplied by another 
function before integrating; this approach will enable us to convert differential 
equations in y(t) and y’(t) to algebraic equations in a new unknown function 
Y(s) that we can solve for Y(s). This method is similar to the use of integrating 
factors when solving linear first-order equations. 

Before we formally define the Laplace transform, we discuss a few
preliminary ideas, some of which are familiar concepts from calculus. First,
we assume throughout this chapter that all forcing functions are piecewise
continuous functions defined for t ≥ 0 and that


$$f(0) = f(0^+) = \lim_{t \to 0^+} f(t) \tag{5.2.3}$$

That is, f cannot be discontinuous at the origin itself, though it is allowed to
have finitely many discontinuities for t > 0.

Furthermore, we assume that the forcing function does not grow more
rapidly than an exponential function. Formally, we will assume that f(t) is of
exponential order, which means that for sufficiently large t,


$$|f(t)| \le M e^{bt} \tag{5.2.4}$$


for positive constants M and b. Functions that are piecewise continuous
and meet conditions (5.2.3) and (5.2.4) are called acceptable. For example,
polynomial functions, sin kt, $e^{at}$, and sums and products of these functions are
acceptable, as are piecewise-defined functions with finitely many discontinuities
whose pieces consist of these basic functions. In particular, linear combinations
of acceptable functions are acceptable. Functions such as


$$e^{t^2}, \quad t^{-1}, \quad (t - 1)^{-1}$$


are not acceptable. The first grows too rapidly to be of exponential order,
the second fails to meet the condition (5.2.3) that a limit exists from the
right at the origin, and the third is not piecewise continuous on any interval
containing t = 1.
In addition, from calculus we recall the following important concepts:

* If y' = f(t) and y(0) = 0, then $y(t) = \int_0^t f(s)\,ds$.


* The improper integral $\int_0^\infty f(t)\,dt$ is said to converge whenever

$$\lim_{r \to \infty} \int_0^r f(t)\,dt$$

exists. If this limit fails to exist, we say the improper integral diverges.


* Given a function of two variables K(s, t), if we integrate this function with
respect to t from t = a to t = b, the result is a function of s. That is,

$$\int_a^b K(s, t)\,dt$$

is a function of s.


Recall our earlier note regarding the overall approach with Laplace
transforms: in order to solve an initial-value problem, we integrate both sides
of the differential equation after both sides have been multiplied by a more
complicated function. The main idea is that we use the transformation given by


$$\int_0^\infty K(s, t) f(t)\,dt$$


Knowing the prominent role that the exponential function has played throughout
our work with differential equations to date, it is not surprising that we
choose to use $K(s, t) = e^{-st}$. Specifically, we make the following definition.


Definition 5.2.1 Let f(t) be an acceptable function defined on the interval
[0, ∞). The Laplace transform of f(t), denoted $\mathcal{L}[f]$, is the function defined by


$$\mathcal{L}[f] = \int_0^\infty f(t) e^{-st}\,dt \tag{5.2.5}$$


We note that because $\mathcal{L}[f]$ is a function of s, we often write F(s) rather than the
more explicit $\mathcal{L}[f(t)]$. We consider an example to see the Laplace transform at
work.


Example 5.2.1 Compute the Laplace transform of f(t) = t.


Solution. By definition,


$$\mathcal{L}[t] = \int_0^\infty t e^{-st}\,dt \tag{5.2.6}$$

Replacing the improper integral with a limit and integrating by parts, we observe
that


$$\begin{aligned}
\mathcal{L}[t] &= \lim_{r \to \infty} \int_0^r t e^{-st}\,dt \\
&= \lim_{r \to \infty} \left[ -\frac{1}{s}\left(t + \frac{1}{s}\right) e^{-st} \right]_0^r \\
&= \lim_{r \to \infty} \left[ -\frac{1}{s}\left(r + \frac{1}{s}\right) e^{-sr} + \frac{1}{s}\left(0 + \frac{1}{s}\right) e^{0} \right] \\
&= \lim_{r \to \infty} \left[ -\frac{1}{s} r e^{-sr} - \frac{1}{s^2} e^{-sr} + \frac{1}{s^2} \right]
\end{aligned} \tag{5.2.7}$$


By L'Hôpital's Rule,¹ we know that $r e^{-sr} \to 0$ as $r \to \infty$ for each s > 0.
Combined with the fact that $e^{-sr} \to 0$ as $r \to \infty$, it follows from (5.2.7) that


$$\mathcal{L}[t] = F(s) = \frac{1}{s^2} \tag{5.2.8}$$


Soon we will apply the Laplace transform in order to solve initial-value problems.
This process will require us to also use the inverse Laplace transform, which asks,
"given a function F(s), what function f(t) is such that $\mathcal{L}[f(t)] = F(s)$?" For
instance, (5.2.8) tells us we may write


$$\mathcal{L}^{-1}\left[\frac{1}{s^2}\right] = t \tag{5.2.9}$$


Much more on inverse transforms will follow as we progress in our study.
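
The result of example 5.2.1 can also be checked numerically. The sketch below is not part of the text's development: it approximates the integral in definition 5.2.1 with composite Simpson's rule on a truncated interval and compares the value with 1/s^2. The name laplace_numeric and the truncation parameters upper and steps are illustrative assumptions, and the truncation is only reasonable when f(t)e^{-st} is negligible beyond upper.

```python
import math

def laplace_numeric(f, s, upper=60.0, steps=60000):
    """Approximate the Laplace integral of f at s by Simpson's rule on [0, upper].

    Assumption: f(t) * exp(-s * t) is negligible for t > upper, so truncating
    the improper integral at `upper` does not visibly change the value.
    """
    if steps % 2:                      # Simpson's rule needs an even step count
        steps += 1
    h = upper / steps
    g = lambda t: f(t) * math.exp(-s * t)
    total = g(0.0) + g(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3

# Compare the numerical transform of f(t) = t with the closed form 1/s^2.
for s in (0.5, 1.0, 2.0):
    approx = laplace_numeric(lambda t: t, s)
    print(f"s = {s}: numeric = {approx:.10f}, 1/s^2 = {1 / s**2:.10f}")
```

The same helper works for any acceptable function, provided s is large enough that the integrand decays well before t = upper.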

It is not obvious that the Laplace transform of every acceptable function
exists. While we omit the proof, it is possible to prove the following theorem by
showing that not only does f(t) being acceptable guarantee that $\mathcal{L}[f(t)] = F(s)$
exists, but that F(s) is a function that must tend to 0 as s → ∞.


Theorem 5.2.1 If f(t) is acceptable, then the Laplace transform F(s) of f(t)
exists. Moreover,


1. sF(s) is bounded as s → ∞, from which it follows that


2. $\lim_{s \to \infty} F(s) = 0$.


Although it is not necessary for a function to be acceptable in order to have a
Laplace transform, our focus will be almost exclusively on acceptable functions.
In addition, we note that not all elementary functions can be generated by
taking the Laplace transform of an acceptable function. For instance, F(s) = 1
cannot be the Laplace transform of an acceptable function since both parts of
theorem 5.2.1 are contradicted.


1 $\lim_{r \to \infty} \dfrac{r}{e^{sr}} = \lim_{r \to \infty} \dfrac{1}{s e^{sr}} = 0.$
The next three examples further illustrate the definition and notational
conventions we use with Laplace transforms.


Example 5.2.2 Compute the Laplace transform of f(t) = 1.


Solution. From the definition, we observe that


$$\mathcal{L}[1] = \int_0^\infty e^{-st}\,dt = \lim_{r \to \infty} \left[ -\frac{1}{s} e^{-st} \right]_0^r = \lim_{r \to \infty} \left[ -\frac{1}{s} e^{-sr} + \frac{1}{s} \right] = \frac{1}{s}$$


since $e^{-sr} \to 0$ as $r \to \infty$.


Example 5.2.3 Find the Laplace transform of $f(t) = e^{at}$.


Solution. We compute


$$\begin{aligned}
\mathcal{L}[e^{at}] &= \int_0^\infty e^{at} e^{-st}\,dt = \int_0^\infty e^{(a-s)t}\,dt = \lim_{r \to \infty} \int_0^r e^{(a-s)t}\,dt \\
&= \lim_{r \to \infty} \left[ \frac{e^{(a-s)t}}{a-s} \right]_0^r = \lim_{r \to \infty} \left[ \frac{e^{(a-s)r}}{a-s} - \frac{1}{a-s} \right] = \frac{1}{s-a}
\end{aligned}$$


provided that s > a, for then $e^{(a-s)r} \to 0$ as $r \to \infty$.


At times, we will need to restrict the values of s in order for the Laplace transform
to exist. Above, we observed that $\mathcal{L}[e^{at}] = 1/(s - a)$, provided that s > a. Usually,
we will suppress the discussion of the restriction on s-values and simply assume
that the domain of the Laplace transform is as large as possible.


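Example 5.2.4 Find $\mathcal{L}[\cos kt]$ and $\mathcal{L}[\sin kt]$.

Solution. By definition,

$$\mathcal{L}[\cos kt] = \int_0^\infty \cos kt \, e^{-st}\,dt$$


Integrating by parts twice or using a table of integrals,


$$\begin{aligned}
\mathcal{L}[\cos kt] &= \lim_{r \to \infty} \left[ \frac{1}{s^2 + k^2} (k \sin kt - s \cos kt) e^{-st} \right]_0^r \\
&= \lim_{r \to \infty} \left[ \frac{e^{-sr} k \sin kr}{s^2 + k^2} - \frac{e^{-sr} s \cos kr}{s^2 + k^2} + \frac{s}{s^2 + k^2} \right]
\end{aligned} \tag{5.2.10}$$


Since $e^{-sr} \to 0$ as $r \to \infty$ and $|\sin kr|$ and $|\cos kr|$ are bounded by 1 as $r \to \infty$,
it follows from (5.2.10) that


$$\mathcal{L}[\cos kt] = \frac{s}{s^2 + k^2}$$

Similar computations show


$$\mathcal{L}[\sin kt] = \frac{k}{s^2 + k^2}$$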
Table 5.1
Laplace transforms of some basic functions

f(t)        F(s) = L[f(t)] = ∫_0^∞ f(t) e^{-st} dt
1           1/s
t           1/s^2
t^2         2/s^3
e^{at}      1/(s - a)
cos kt      s/(s^2 + k^2)
sin kt      k/(s^2 + k^2)


We close this section with table 5.1, which summarizes the Laplace 
transforms we have computed so far. 


Observe that each line in the table may also be written in inverse form. For
example, $\mathcal{L}^{-1}[1/(s - a)] = e^{at}$. This will be particularly useful in the next
section as we see the first example of how the transform and its inverse can be
used to solve an initial-value problem. In order to apply the Laplace transform
successfully, we need to develop a deeper understanding of its properties
and explore the impact of the transform on a wide range of functions. The
following exercises and our investigations in the next section continue our work
to this end.
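
As a sanity check on table 5.1, each entry can be compared against a direct numerical evaluation of the defining integral. The following is a rough sketch under stated assumptions: the helper laplace_numeric (composite Simpson's rule on a truncated interval) and the sample values a = 0.5, k = 3, s = 2 are illustrative choices, not part of the text, and s must exceed a for the exponential entry.

```python
import math

def laplace_numeric(f, s, upper=60.0, steps=60000):
    """Simpson approximation of the Laplace integral of f on [0, upper].

    Assumes f(t) * exp(-s * t) is negligible beyond `upper`.
    """
    if steps % 2:
        steps += 1
    h = upper / steps
    g = lambda t: f(t) * math.exp(-s * t)
    total = g(0.0) + g(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3

a, k = 0.5, 3.0      # arbitrary sample parameters (assumed for illustration)
s = 2.0              # sample point; requires s > a and s > 0

# Each row: label, f(t), and the closed-form transform F(s) from table 5.1.
table = [
    ("1",      lambda t: 1.0,             lambda s: 1 / s),
    ("t",      lambda t: t,               lambda s: 1 / s**2),
    ("t^2",    lambda t: t * t,           lambda s: 2 / s**3),
    ("e^{at}", lambda t: math.exp(a * t), lambda s: 1 / (s - a)),
    ("cos kt", lambda t: math.cos(k * t), lambda s: s / (s**2 + k**2)),
    ("sin kt", lambda t: math.sin(k * t), lambda s: k / (s**2 + k**2)),
]

errors = {name: abs(laplace_numeric(f, s) - F(s)) for name, f, F in table}
for name, err in errors.items():
    print(f"{name:8s} |numeric - table| = {err:.2e}")
```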


Exercises 5.2 In exercises 1-4, explain why the limit of each function g(r) is
0 as r → ∞. In each, assume s > 0.


1. $g(r) = r e^{-sr}$
2. $g(r) = r^2 e^{-sr}$
3. $g(r) = r^n e^{-sr}$
4. $g(r) = e^{-sr} \sin kr$


In exercises 5-16, use the definition of the Laplace transform to compute $\mathcal{L}[f(t)]$.
For each, state the domain of s-values on which $\mathcal{L}[f(t)] = F(s)$ is defined.


5. f(t) = 2t
6. f(t) = t - 3
7. f(t) = 2 - t
8. $f(t) = t^2$
9. $f(t) = t^2 - 3$
10. $f(t) = (t - 2)^2$
isiQjsSe 


IF, fijee 
13. f(t) = > 
14. f(t) =cos4t 
15. f(t) = te™ 
16. f(t) = tsin2t 


From examples 5.2.2 and 5.2.1, we know that

$$\mathcal{L}[1] = \frac{1}{s} \quad \text{and} \quad \mathcal{L}[t] = \frac{1}{s^2}$$


Use these facts to compute the Laplace transform of each of the functions
in exercises 17-19 with as little computation as possible. What properties of
integrals and limits are being used?


17. f(t) = 1 + t
18. f(t) = 3t - 2
19. f(t) = c + kt


20. Explain why the Laplace transform is a linear operator on the vector space
of acceptable functions.² That is, explain why for any real numbers a and b
and any acceptable functions f and g,


$$\mathcal{L}[af(t) + bg(t)] = a\mathcal{L}[f(t)] + b\mathcal{L}[g(t)]$$


5.3 General properties of the Laplace transform 


In many ways, the Laplace transform resembles the differentiation and
integration operators from calculus. For example, given a function f(t) =
$3t^4 + 5t + 1$, taking the derivative results in a new function f'(t). Using the
alternate notation D[f] for the derivative of f with respect to t, we see that


$$D[3t^4 + 5t + 1] = 12t^3 + 5$$


2 See appendix D for further discussion on linear transformations of vector spaces.

In particular, the "D" operator transforms one function into another. Likewise,
if we consider the definite integral of f(t) = t - 1 from t = 0 to t = x, we
find that


$$\int_0^x (t - 1)\,dt = \frac{1}{2}x^2 - x$$


Letting $I(f) = \int_0^x f(t)\,dt$, we see that I transforms one function f(t) into another
function F(x) by the process of integration. In the same way, as we have seen in
examples 5.2.1-5.2.4, the Laplace transform takes an acceptable function f(t)
and transforms it into a new function F(s) by a process slightly more complicated
than standard integration.

From calculus and our preceding work with differential equations, we know 
that taking the derivative of a function is a linear process, as is calculating the 
definite integral. More specifically, for any constants a and b and functions f(t) 
and g(t) that are differentiable and integrable, we know that 


$$D[af(t) + bg(t)] = aD[f(t)] + bD[g(t)]$$


and

$$\int_0^x [af(t) + bg(t)]\,dt = a\int_0^x f(t)\,dt + b\int_0^x g(t)\,dt$$


Similarly, because the Laplace transform's definition involves limits and
integrals, it has the same properties of linearity as the derivative and integral
operators. In particular, as was shown in exercise 20 of section 5.2, the following
theorem holds.


Theorem 5.3.1 For every pair of scalars a and b and acceptable functions f(t)
and g(t),


$$\mathcal{L}[af(t) + bg(t)] = a\mathcal{L}[f(t)] + b\mathcal{L}[g(t)] \tag{5.3.1}$$


Theorem 5.3.1 shows that the Laplace transform, like the differential and 
integral operators, is a linear transformation or linear operator. Formally, a linear 
transformation is a function T that maps one vector space V to another vector 
space W where T satisfies the property that for all constants a and b and all 
elements u and v in V, T(au+ bv) = aT(u) + bT(v). Appendix D provides 
further discussion on linear transformations of vector spaces. 

In calculus, following the definitions of the derivative and the definite 
integral, we quickly discover more general properties that enable us to compute 
derivatives and integrals without using the definition directly. In the same 
way, while we have seen a few examples of how to use the definition to 
compute the Laplace transform of certain functions f(t), we can use results 
such as theorem 5.3.1 to more easily determine the Laplace transform of more 
complicated functions. Two examples follow. 


Example 5.3.1 Find the Laplace transform of $f(t) = 7 - 3e^{2t}$.

Solution. We know from examples 5.2.2 and 5.2.3 that $\mathcal{L}[1] = 1/s$ and
$\mathcal{L}[e^{2t}] = 1/(s - 2)$. By theorem 5.3.1 it now follows that


$$\mathcal{L}[7 - 3e^{2t}] = 7\mathcal{L}[1] - 3\mathcal{L}[e^{2t}] = \frac{7}{s} - \frac{3}{s - 2}$$


We note that the individual Laplace transforms are defined on different domains:
7/s is valid for s > 0 while 3/(s - 2) is defined if s > 2. We usually suppress
discussion of this issue and assume that $\mathcal{L}[f(t)]$ is defined on the largest interval
possible. In example 5.3.1, this domain is {s | s > 2}.

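Example 5.3.2 Find the Laplace transform of cosh kt and sinh kt.


Solution. By definition, the hyperbolic cosine function is given by $\cosh kt =
\frac{1}{2}e^{kt} + \frac{1}{2}e^{-kt}$. By the linearity of the Laplace transform, it follows that


$$\mathcal{L}[\cosh kt] = \frac{1}{2}\mathcal{L}[e^{kt}] + \frac{1}{2}\mathcal{L}[e^{-kt}] = \frac{1}{2}\left( \frac{1}{s - k} + \frac{1}{s + k} \right) = \frac{s}{s^2 - k^2}$$

Similarly,


$$\mathcal{L}[\sinh kt] = \mathcal{L}\left[ \frac{1}{2}e^{kt} - \frac{1}{2}e^{-kt} \right] = \frac{1}{2}\left( \frac{1}{s - k} - \frac{1}{s + k} \right) = \frac{k}{s^2 - k^2}$$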


In addition to taking linear combinations of functions, we often want to multiply
a given function by t or some power of t. For example, it is natural to wonder
if we can use our work in preceding examples to compute $\mathcal{L}[te^{at}]$. If we first
consider the Laplace transforms of the simple power functions 1, t, $t^2$, and so
on, we find evidence for a conjecture on how we might approach $\mathcal{L}[te^{at}]$. In
particular, note that

$$\mathcal{L}[1] = \frac{1}{s}, \quad \mathcal{L}[t] = \frac{1}{s^2}, \quad \mathcal{L}[t^2] = \frac{2}{s^3} \tag{5.3.2}$$

The last result was shown in exercise 8 of section 5.2. In fact, we could go on to
show that $\mathcal{L}[t^3] = 6/s^4$. This sequence of results reminds us of derivatives: in
particular,

$$\frac{d}{ds}\left[\frac{1}{s}\right] = -\frac{1}{s^2}, \quad \frac{d}{ds}\left[\frac{1}{s^2}\right] = -\frac{2}{s^3}, \quad \frac{d}{ds}\left[\frac{2}{s^3}\right] = -\frac{6}{s^4} \tag{5.3.3}$$


From this sequence of examples, it appears that each time we take a given
function $f(t) = t^n$ and multiply it by t, the impact on its Laplace transform is
that the transform of the new function is the opposite of the derivative of the
transform of the original. Using a result from multivariable calculus known as
Leibniz's rule, a formal proof of this fact may be established, not only for power
functions, but also for all functions having Laplace transforms. We defer this
work to exercise 25 and state the following theorem.

Theorem 5.3.2 If $\mathcal{L}[f(t)] = F(s)$, then


$$\mathcal{L}[tf(t)] = -F'(s) = -\frac{d}{ds}[F(s)] \tag{5.3.4}$$


Theorem 5.3.2 enables us to expand on our observations above regarding the
Laplace transforms of the power functions t, $t^2$, $t^3$, and so on. In particular,
replacing F(s) with $\mathcal{L}[f(t)]$, we can take the perspective that (5.3.4) implies


$$\mathcal{L}[tf(t)] = -\frac{d}{ds}\mathcal{L}[f(t)] \tag{5.3.5}$$


This shows that, for example,


$$\mathcal{L}[t^4] = \mathcal{L}[t \cdot t^3] = -\frac{d}{ds}\mathcal{L}[t^3] = -\frac{d}{ds}\left[\frac{6}{s^4}\right] = \frac{24}{s^5}$$


In addition, a generalization of this reasoning can be used to show the following
corollary to theorem 5.3.2. See exercise 26.

Corollary 5.3.3 For each positive integer n,


$$\mathcal{L}[t^n f(t)] = (-1)^n F^{(n)}(s) \tag{5.3.6}$$


We next consider two examples that show how we can use recent results to
compute the Laplace transform of familiar functions multiplied by t.


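Example 5.3.3 Find $\mathcal{L}[te^{at}]$ and $\mathcal{L}[t^2 e^{at}]$.


Solution. We know from earlier work that $\mathcal{L}[e^{at}] = 1/(s - a)$. It follows from
theorem 5.3.2 that


$$\mathcal{L}[te^{at}] = -\frac{d}{ds}\mathcal{L}[e^{at}] = -\frac{d}{ds}\left[\frac{1}{s - a}\right] = \frac{1}{(s - a)^2}$$


Similarly,


$$\mathcal{L}[t^2 e^{at}] = -\frac{d}{ds}\mathcal{L}[te^{at}] = -\frac{d}{ds}\left[\frac{1}{(s - a)^2}\right] = \frac{2}{(s - a)^3}$$

In fact, as we will see in exercise 27, we can show in general that


$$\mathcal{L}[t^n e^{at}] = \frac{n!}{(s - a)^{n+1}} \tag{5.3.7}$$


Example 5.3.4 Find $\mathcal{L}[t \sin kt]$.

Solution. In example 5.2.4, we showed that

$$\mathcal{L}[\sin kt] = \frac{k}{s^2 + k^2}$$

Applying theorem 5.3.2, we know that


$$\mathcal{L}[t \sin kt] = -\frac{d}{ds}\left[\frac{k}{s^2 + k^2}\right] = \frac{2ks}{(s^2 + k^2)^2}$$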
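
Theorem 5.3.2 can be illustrated numerically for example 5.3.4. The sketch below compares three quantities at a sample point: the closed form 2ks/(s^2 + k^2)^2, a central finite-difference estimate of -F'(s) for F(s) = k/(s^2 + k^2), and a direct Simpson approximation of the integral defining the transform of t sin kt. The helper names, the step size h, and the sample values k = 3 and s = 2 are assumptions made for illustration only.

```python
import math

def laplace_numeric(f, s, upper=60.0, steps=60000):
    """Simpson approximation of the Laplace integral of f on [0, upper].

    Assumes f(t) * exp(-s * t) is negligible beyond `upper`.
    """
    if steps % 2:
        steps += 1
    h = upper / steps
    g = lambda t: f(t) * math.exp(-s * t)
    total = g(0.0) + g(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3

def F(s, k):
    """Closed-form transform of sin(kt) from example 5.2.4."""
    return k / (s**2 + k**2)

def minus_F_prime(s, k, h=1e-5):
    """Central-difference estimate of -dF/ds."""
    return -(F(s + h, k) - F(s - h, k)) / (2 * h)

k, s = 3.0, 2.0                                   # sample values (assumed)
closed_form = 2 * k * s / (s**2 + k**2) ** 2      # result of example 5.3.4
from_derivative = minus_F_prime(s, k)             # theorem 5.3.2, numerically
from_integral = laplace_numeric(lambda t: t * math.sin(k * t), s)

print(closed_form, from_derivative, from_integral)
```

All three values should agree to many decimal places, which is what the theorem predicts.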


As we have noted, we are motivated to develop the Laplace transform by the
need to solve initial-value problems that involve unusual forcing functions. For
example, we will soon work to solve equations of the form


$$y' + ay = f(t) \tag{5.3.8}$$


where f(t) is a step function or other piecewise-defined function. We will use our
understanding of the Laplace transform to solve these equations by taking the
Laplace transform of each side of (5.3.8) to transform the differential equation
(in t) into an algebraic equation (in s). Our hope is that upon doing so, we can
solve the new algebraic equation in order to ultimately solve the differential one.

To see how this process begins, we take the Laplace transform of both sides
of (5.3.8) and apply the linearity property. Doing so results in the equation


$$\mathcal{L}[y'] + a\mathcal{L}[y] = \mathcal{L}[f(t)] \tag{5.3.9}$$


Here, we realize that while we can compute $\mathcal{L}[f(t)]$ using the definition or
established results, it is unclear how to work with $\mathcal{L}[y']$ and $\mathcal{L}[y]$. Ideally, if
we could understand how the Laplace transform $\mathcal{L}[y']$ of the derivative of an
unknown function is related to the Laplace transform $\mathcal{L}[y]$ of the function itself,
that would enable us to work with one unknown quantity. To this end, we return
to the definition and show how $\mathcal{L}[y']$ depends on $\mathcal{L}[y]$.

Let us suppose that y and y' are acceptable functions and that y is
continuous. By definition,


$$\mathcal{L}[y'(t)] = \int_0^\infty y'(t) e^{-st}\,dt = \lim_{r \to \infty} \int_0^r y'(t) e^{-st}\,dt \tag{5.3.10}$$


To evaluate $\int_0^r y'(t) e^{-st}\,dt$, we use integration by parts with $u = e^{-st}$ and $dv =
y'(t)\,dt$. It follows that $du = -s e^{-st}\,dt$ and v = y(t). Integrating³ (5.3.10),


$$\begin{aligned}
\mathcal{L}[y'(t)] &= \lim_{r \to \infty} \left[ y(t) e^{-st} \Big|_0^r + s \int_0^r y(t) e^{-st}\,dt \right] \\
&= \lim_{r \to \infty} \left[ y(r) e^{-sr} - y(0) + s \int_0^r y(t) e^{-st}\,dt \right]
\end{aligned} \tag{5.3.11}$$


3 The integration by parts formula holds since y is continuous. If y has a jump discontinuity, then
this part of the argument is more complicated.

Since y is an acceptable function, it is of exponential order and $|y(t)| \le Me^{bt}$ for
some positive constants M and b. Assuming that s > b, it follows that $y(r)e^{-sr} \to 0$
as $r \to \infty$. In addition, in (5.3.11) we observe


$$\lim_{r \to \infty} s \int_0^r y(t) e^{-st}\,dt = s \int_0^\infty y(t) e^{-st}\,dt = s\mathcal{L}[y(t)]$$

by the definition of the Laplace transform. Hence, (5.3.11) implies

$$\mathcal{L}[y'(t)] = s\mathcal{L}[y(t)] - y(0) \tag{5.3.12}$$

Our work has proved the following theorem.


Theorem 5.3.4 Suppose y(t) is continuous and y(t) and y'(t) are acceptable.
Then


$$\mathcal{L}[y'(t)] = s\mathcal{L}[y(t)] - y(0) \tag{5.3.13}$$
Note particularly the appearance of y(0) in the conclusion of theorem 5.3.4. 
This foreshadows how we will use the Laplace transform to solve an initial- 
value problem directly without resorting to a general solution of the associated 
differential equation. To see further how we will use the Laplace transform, we 
consider the following example. 


Example 5.3.5 Use the Laplace transform to solve the initial-value problem


$$y' + y = e^{-t}, \quad y(0) = 0 \tag{5.3.14}$$


Solution. We begin by taking the Laplace transform of both sides of (5.3.14)
to achieve

$$\mathcal{L}[y'] + \mathcal{L}[y] = \mathcal{L}[e^{-t}] \tag{5.3.15}$$


From example 5.2.3, we know that $\mathcal{L}[e^{-t}] = 1/(s + 1)$. Furthermore, we just
established


$$\mathcal{L}[y'] = s\mathcal{L}[y] - y(0) \tag{5.3.16}$$

Combining (5.3.15), (5.3.16), and the given fact that y(0) = 0, we have

$$s\mathcal{L}[y] + \mathcal{L}[y] = \frac{1}{s + 1} \tag{5.3.17}$$

Letting $Y(s) = \mathcal{L}[y]$, factoring, and solving for Y(s),

$$Y(s) = \frac{1}{(s + 1)^2} \tag{5.3.18}$$


To solve the initial-value problem, it remains for us to determine the function
y(t) whose Laplace transform is $Y(s) = 1/(s + 1)^2$. That is, we must find
$\mathcal{L}^{-1}[Y(s)] = \mathcal{L}^{-1}[1/(s + 1)^2]$. In example 5.3.3, we saw that $\mathcal{L}[te^{at}] = 1/(s - a)^2$.
In particular,


$$\mathcal{L}[te^{-t}] = \frac{1}{(s + 1)^2}, \quad \text{so that} \quad \mathcal{L}^{-1}\left[\frac{1}{(s + 1)^2}\right] = te^{-t}$$

From (5.3.18), it now follows that

$$y(t) = te^{-t}$$

This is precisely the solution we expect had we applied another method (such
as using an integrating factor) to solve (5.3.14).
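
One way to gain confidence in the solution y(t) = te^{-t} is to integrate the initial-value problem (5.3.14) numerically and compare. The sketch below uses the classical fourth-order Runge-Kutta method, which is a technique outside this chapter and is brought in here only as an independent check; the function name rk4_solve and the step counts are illustrative assumptions.

```python
import math

def rk4_solve(rhs, y0, t_end, steps):
    """Classical RK4 for y' = rhs(t, y) with y(0) = y0; returns y(t_end)."""
    h = t_end / steps
    t, y = 0.0, y0
    for _ in range(steps):
        k1 = rhs(t, y)
        k2 = rhs(t + h / 2, y + h * k1 / 2)
        k3 = rhs(t + h / 2, y + h * k2 / 2)
        k4 = rhs(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y

# y' + y = e^{-t} rewritten in the form y' = e^{-t} - y, with y(0) = 0.
rhs = lambda t, y: math.exp(-t) - y

for t_end in (0.5, 1.0, 2.0, 5.0):
    numeric = rk4_solve(rhs, 0.0, t_end, 2000)
    exact = t_end * math.exp(-t_end)      # solution found via the Laplace transform
    print(f"t = {t_end}: RK4 = {numeric:.10f}, t e^(-t) = {exact:.10f}")
```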


Note particularly that our work in (5.3.14)-(5.3.17) converted the given
initial-value problem (5.3.14) involving y' to an algebraic equation (5.3.17) involving
$\mathcal{L}[y] = Y(s)$. We then had to use the inverse Laplace transform in order to
determine y(t). This process is typical of how the transform is used to solve
IVPs; at this point, we largely need to gain experience with more complicated
functions and situations in order to solve more advanced problems.

We make note of one more result, which relates the Laplace transform of a
higher order derivative to the transform of the original function and will help
us solve higher order IVPs. We then proceed to establish additional results on
products of familiar functions and piecewise-defined functions in order to more
fully understand the workings of the Laplace transform.


Corollary 5.3.5 Suppose y(t) and y'(t) are continuous and y(t), y'(t), and
y''(t) are acceptable. Then


$$\mathcal{L}[y''(t)] = s^2\mathcal{L}[y(t)] - sy(0) - y'(0) \tag{5.3.19}$$


The proof of corollary 5.3.5 is straightforward by two applications of 
theorem 5.3.4; see exercise 28. 

In theorem 5.3.2, we computed the Laplace transform of tf(t) in terms of
the Laplace transform of f(t). In addition to multiplying by t (or powers of t),
another function that arises frequently in the study of differential equations
is $e^{at}$. Hence we are naturally interested in how $\mathcal{L}[e^{at}f(t)]$ is related to
$\mathcal{L}[f(t)]$.

Letting f(t) be an acceptable function and $\mathcal{L}[f(t)] = F(s)$, we have by
definition that


$$F(s) = \int_0^\infty f(t) e^{-st}\,dt \tag{5.3.20}$$


For the Laplace transform of $e^{at}f(t)$, we note that $e^{at}f(t)$ is an acceptable
function and, by definition,


$$\mathcal{L}[e^{at}f(t)] = \int_0^\infty e^{at} f(t) e^{-st}\,dt = \int_0^\infty f(t) e^{-(s-a)t}\,dt \tag{5.3.21}$$


From the right-hand sides of (5.3.20) and (5.3.21), we observe that the only
difference is that s has been replaced by s - a. In particular, $\mathcal{L}[e^{at}f(t)] = F(s - a)$,
where $\mathcal{L}[f(t)] = F(s)$. We say that F(s) has been shifted by multiplying f(t) by
$e^{at}$ and call the theorem we have just proved the first shifting property, which is
stated as follows.

Theorem 5.3.6 (First Shifting Property) Let f(t) be acceptable and $\mathcal{L}[f(t)] =
F(s)$. For any real value of a,


$$\mathcal{L}[e^{at}f(t)] = F(s - a)$$


In the next example, we compute three Laplace transforms to show the
straightforward application of theorem 5.3.6.


Example 5.3.6 Find $\mathcal{L}[e^{at} \cos kt]$, $\mathcal{L}[e^{at} \sin kt]$, and $\mathcal{L}[e^{at} t^2]$.


Solution. We have already established that


$$\mathcal{L}[\cos kt] = \frac{s}{s^2 + k^2}$$

so by the first shifting property,

$$\mathcal{L}[e^{at} \cos kt] = \frac{s - a}{(s - a)^2 + k^2}$$

Similarly, from the fact that

$$\mathcal{L}[\sin kt] = \frac{k}{s^2 + k^2}$$

we observe

$$\mathcal{L}[e^{at} \sin kt] = \frac{k}{(s - a)^2 + k^2}$$

Finally,

$$\mathcal{L}[t^2] = \frac{2}{s^3}$$

and theorem 5.3.6 together imply

$$\mathcal{L}[e^{at} t^2] = \frac{2}{(s - a)^3}$$
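
The three transforms in example 5.3.6 can be spot-checked by evaluating the defining integral numerically and comparing with F(s - a). This is a minimal sketch, assuming a truncated Simpson approximation is adequate; the helper laplace_numeric and the sample values a = 1, k = 2, s = 4 (chosen so that s > a) are illustrative and not taken from the text.

```python
import math

def laplace_numeric(f, s, upper=60.0, steps=60000):
    """Simpson approximation of the Laplace integral of f on [0, upper].

    Assumes f(t) * exp(-s * t) is negligible beyond `upper`.
    """
    if steps % 2:
        steps += 1
    h = upper / steps
    g = lambda t: f(t) * math.exp(-s * t)
    total = g(0.0) + g(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3

a, k, s = 1.0, 2.0, 4.0      # sample values (assumed); the integrals need s > a

# Pairs of (numerical transform of e^{at} f(t), shifted closed form F(s - a)).
checks = {
    "e^{at} cos kt": (
        laplace_numeric(lambda t: math.exp(a * t) * math.cos(k * t), s),
        (s - a) / ((s - a) ** 2 + k**2),
    ),
    "e^{at} sin kt": (
        laplace_numeric(lambda t: math.exp(a * t) * math.sin(k * t), s),
        k / ((s - a) ** 2 + k**2),
    ),
    "e^{at} t^2": (
        laplace_numeric(lambda t: math.exp(a * t) * t * t, s),
        2 / (s - a) ** 3,
    ),
}

for name, (numeric, shifted) in checks.items():
    print(f"{name:14s} numeric = {numeric:.10f}, F(s - a) = {shifted:.10f}")
```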


A summary of the results we established in this section follows in table 5.2. 


Exercises 5.3 In exercises 1—5, use the linearity property and the transforms 
derived in the examples to find the Laplace transform of the given function. 


1. f(t)=3-e! 


2. f(t) = 4cost+2sint 
3. f(t) = 3e7* — 3sin2t 
4. f(t) =2+4+5sin3t 

5. f(t) =4cos5t — 6e—7# 
Table 5.2
Summary of results on the Laplace Transform from section 5.3

f(t)              F(s) = L[f(t)] = ∫_0^∞ f(t) e^{-st} dt
af(t) + bg(t)     aL[f(t)] + bL[g(t)]
t f(t)            -F'(s) = -(d/ds) L[f(t)]
t^n f(t)          (-1)^n F^{(n)}(s)
f'(t)             sL[f(t)] - f(0) = sF(s) - f(0)
f''(t)            s^2 L[f(t)] - sf(0) - f'(0) = s^2 F(s) - sf(0) - f'(0)
e^{at} f(t)       F(s - a)


In exercises 6-11, use theorem 5.3.2 or corollary 5.3.3 and the transforms derived 
in the examples to find the Laplace transform of the given function. 


6. f(t) =3ie* 
7.f(t)= Pert 
8. f(t) = 3tcos4t 
9. f(t) =P sint 
10. f(t) = t? cost 


)= 
11. f(t) =4cos5t — 6e—7# 


In exercises 12— 17, use the first shifting property and the transforms derived in 
the examples to find the Laplace transform of the given function. 


12. f(t) = 3te*" 


in fwmsre 

14. f(t) = e~?" cos4t 
15. f(t) =e" sin2t 
16. f(t) = e* sinh2t 
17. f(t) = cosh2tsin3t 


In exercises 18-24, use established general properties and the transforms derived 
in the examples to find the Laplace transform of the given function. 


18. f(t) = 3te** — e?' cost 
19. f(t) = 4t?e7' +. Ze! sint 
20. f(t) = e~7# (4? +4t +5) 
21. $f(t) = (t^2 - t) \sin t$

22. f(t) = t(cos4t — 2sin4t) 
23. f(H=te ‘sm2t 

24, f(t) = t?e7' sin2t 


25. In multivariable calculus, students may have encountered Leibniz's rule,
which allows differentiation across the integral sign. In particular, the rule
states that under reasonable hypotheses on a function K(s, t),


$$\frac{d}{ds} \int_{t=a}^{t=b} K(s, t)\,dt = \int_{t=a}^{t=b} \frac{\partial}{\partial s}[K(s, t)]\,dt$$


Use Leibniz's rule to explain why theorem 5.3.2 is true. In particular, show
that if $F(s) = \mathcal{L}[f(t)]$, then $-F'(s) = \mathcal{L}[tf(t)]$.


26. Using the rule established in theorem 5.3.2, show why corollary 5.3.3 is
true. Specifically, show that if n is a positive integer, then


$$\mathcal{L}[t^n f(t)] = (-1)^n F^{(n)}(s)$$

(Hint: Apply the theorem to $\mathcal{L}[t \cdot t^{n-1} f(t)]$ to show that


$$\mathcal{L}[t^n f(t)] = -\frac{d}{ds}\mathcal{L}[t^{n-1} f(t)]$$


and then repeat this line of reasoning on the expression $\mathcal{L}[t^{n-1} f(t)]$.)

27. Use corollary 5.3.3 to show that


$$\mathcal{L}[t^n e^{at}] = \frac{n!}{(s - a)^{n+1}}$$


28. Apply theorem 5.3.4 twice to prove corollary 5.3.5.


29. Express £ [ f (2)] in terms of L[f(t)] and the first three derivatives of 
f(t) at t = 0 by using theorem 5.3.4. 


30. We have established that $\mathcal{L}[e^{at}] = 1/(s - a)$ for any real number a.
Assume now that this formula holds for any complex number a = α + βi,
and hence compute the Laplace transform


$$\mathcal{L}[e^{(\alpha + \beta i)t}]$$


Use Euler's formula and properties of complex numbers to show that


$$\mathcal{L}[e^{\alpha t}(\cos \beta t + i \sin \beta t)] = \frac{s - \alpha}{(s - \alpha)^2 + \beta^2} + i\,\frac{\beta}{(s - \alpha)^2 + \beta^2}$$

Explain how equating real and imaginary parts produces an alternate
derivation for the Laplace transforms of $e^{\alpha t} \cos \beta t$ and $e^{\alpha t} \sin \beta t$.


Figure 5.3 The translated unit step function u(t - a).


5.4 Piecewise continuous functions 


In physical applications, we sometimes encounter step functions that represent 
some quantity being turned on or off, such as an electric switch. If a mass 
in a spring-mass system is struck with a hammer or a drug is delivered by 
muscle injection, impulse functions that involve forces acting over very short 
time periods play a key role. To help us address these and related situations, we 
study the application of the Laplace transform to two important functions—the 
Heaviside function and the Dirac delta function. 


5.4.1 The Heaviside function 


We define the Heaviside function, or unit step function, denoted u(t), to be the
function that is 0 for all t < 0 and 1 for all t ≥ 0. That is,

$$u(t) = \begin{cases} 0, & \text{if } t < 0 \\ 1, & \text{if } t \ge 0 \end{cases} \tag{5.4.1}$$

Often, we will make use of a step function that turns on at t = a, rather than
t = 0. Thus we employ the translated unit step function, u(t - a), which by (5.4.1)
is given by


$$u(t - a) = \begin{cases} 0, & \text{if } t < a \\ 1, & \text{if } t \ge a \end{cases} \tag{5.4.2}$$


A plot of the translated unit step function is given in figure 5.3.


Step functions may be used to turn other functions on or off. For example, if we
consider the function f(t) = (4 - t)u(t - 4), we observe that since u(t - 4) = 0
for t < 4 and u(t - 4) = 1 for t ≥ 4, it follows


$$f(t) = \begin{cases} 0, & \text{if } t < 4 \\ 4 - t, & \text{if } t \ge 4 \end{cases} \tag{5.4.3}$$


From this perspective, we see that the function (4 - t) is off until t = 4, at which
time it is turned on.

To see how we can use step functions to turn another function both on and
off at various times, we consider the function g(t) = u(t - a) - u(t - b), where
a < b. This difference of translated unit step functions turns on for a ≤ t < b
and turns off when t ≥ b. More specifically, for t < a, both u(t - a) and u(t - b)
are zero, so g(t) = 0. For a ≤ t < b, u(t - a) = 1 and u(t - b) = 0, thus g(t) = 1.
And finally, once t ≥ b, both u(t - a) = 1 and u(t - b) = 1, so that g(t) = 0.
This can be written equivalently as


$$g(t) = \begin{cases} 0, & \text{if } t < a \\ 1, & \text{if } a \le t < b \\ 0, & \text{if } t \ge b \end{cases} \tag{5.4.4}$$


This property of the function u(t - a) - u(t - b) enables us to write a
single formula for any piecewise-defined function that arises, rather than the
traditional cases format where we stipulate the different formulas on different
intervals, as in (5.4.4). The next example demonstrates the role of u(t - a) -
u(t - b).


Example 5.4.1 Define the following piecewise function using unit step
functions.

$$f(t) = \begin{cases} t, & \text{if } 0 \le t < 2 \\ 2, & \text{if } 2 \le t < 4 \\ 0, & \text{otherwise} \end{cases}$$

Solution. We use the fact that the function u(t) - u(t - 2) is 1 in the interval
0 ≤ t < 2 and 0 otherwise, and u(t - 2) - u(t - 4) is 1 on 2 ≤ t < 4 and 0
otherwise. Thus, we turn on t for 0 ≤ t < 2 and turn on 2 for 2 ≤ t < 4 by
writing

$$\begin{aligned}
f(t) &= t[u(t) - u(t - 2)] + 2[u(t - 2) - u(t - 4)] \\
&= tu(t) + (2 - t)u(t - 2) - 2u(t - 4)
\end{aligned}$$

A plot of f(t) is shown in figure 5.4.
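
The equivalence of the cases format and the single step-function formula in example 5.4.1 can be tested directly. The short sketch below is an illustration only: it assumes the convention u(t) = 1 for t ≥ 0, and the names u, f_cases, f_steps, and the sample points are hypothetical choices made here.

```python
def u(t):
    """Unit step: 0 for t < 0 and 1 for t >= 0 (the convention assumed here)."""
    return 1.0 if t >= 0 else 0.0

def f_cases(t):
    """Example 5.4.1 written in the traditional cases format."""
    if 0 <= t < 2:
        return t
    if 2 <= t < 4:
        return 2.0
    return 0.0

def f_steps(t):
    """The same function as a single formula built from step functions."""
    return t * u(t) + (2 - t) * u(t - 2) - 2 * u(t - 4)

# Sample points include the switching times t = 0, 2, 4 and values on each side.
samples = [-1.0, 0.0, 0.5, 1.999, 2.0, 3.0, 3.999, 4.0, 7.5]
mismatches = [t for t in samples if abs(f_cases(t) - f_steps(t)) > 1e-12]
print("points where the two formulas disagree:", mismatches)
```

Writing the function once as f_steps is what later allows the second shifting property to be applied term by term.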


Figure 5.4 The function f(t) in example 5.4.1.


At this point, we should again not lose sight of our goal: we are interested in
using Laplace transforms to solve initial-value problems such as


$$y'' + 2y' + 5y = u(t - 2), \quad y(0) = 1, \quad y'(0) = 0$$


where the forcing function is turned on at time t = 2. Since we will solve such
equations by taking the Laplace transform of both sides, we must understand the
transform of basic step functions. In fact, since step functions will be used to turn
other functions on and off, we are more generally interested in $\mathcal{L}[u(t - a)f(t)]$.
We return to the definition to explore this situation further.

Because we will employ a change of variables in our work, we begin by
using z as a different variable of integration than the usual t in the definition.
Specifically, from the definition of the Laplace transform we have


$$\mathcal{L}[u(t - a)f(t)] = \int_0^\infty u(z - a) f(z) e^{-sz}\,dz = \int_a^\infty f(z) e^{-sz}\,dz$$


The second equality follows from the fact that u(z - a) = 0 for all z < a and
u(z - a) = 1 for all z ≥ a, which allows us to eliminate the presence of the unit
step function.

We now employ the substitution z = t + a and note that t = z - a and
dz = dt. From this and our work above, we see


$$\begin{aligned}
\mathcal{L}[u(t - a)f(t)] &= \int_a^\infty f(z) e^{-sz}\,dz \\
&= \lim_{r \to \infty} \int_{z=a}^{z=r} f(z) e^{-sz}\,dz \\
&= \lim_{r \to \infty} \int_{t=0}^{t=r-a} f(t + a) e^{-s(t+a)}\,dt \\
&= \lim_{r \to \infty} \int_{t=0}^{t=r-a} f(t + a) e^{-st} e^{-as}\,dt
\end{aligned} \tag{5.4.5}$$

In (5.4.5), since $e^{-as}$ is constant with respect to t, we can remove it from the
integral. Moreover, we can take the limit as $r \to \infty$ and note that $(r - a) \to \infty$
as well. From this, we now have


$$\mathcal{L}[u(t - a)f(t)] = e^{-as} \int_0^\infty f(t + a) e^{-st}\,dt$$

On the right, we observe that the Laplace transform of f(t + a) has arisen, and
therefore


$$\mathcal{L}[u(t - a)f(t)] = e^{-as} \mathcal{L}[f(t + a)]$$


We call this result the second shifting property and state it formally in the next
theorem.


Theorem 5.4.1 (Second Shifting Property) If f(t) has a Laplace transform, then

$$\mathcal{L}[u(t - a)f(t)] = e^{-as} \mathcal{L}[f(t + a)] \tag{5.4.6}$$


When working with inverse transforms, we'll often use the equivalent formulations
of this result that


$$\mathcal{L}[u(t - a)f(t - a)] = e^{-as} \mathcal{L}[f(t)] \quad \text{or} \quad \mathcal{L}^{-1}[e^{-as} F(s)] = u(t - a)f(t - a) \tag{5.4.7}$$


which come from replacing t with t - a in the argument of f. To see how the
second shifting property works and gain more experience with the roles played
by unit step functions, we consider several examples.


Example 5.4.2 Determine the Laplace transform of the step function, u(t - 3).

Solution. We can view u(t - 3) as the function u(t - 3) · 1. Since we know
that $\mathcal{L}[1] = 1/s$, by the second shifting property it follows that


$$\mathcal{L}[u(t - 3)] = \mathcal{L}[u(t - 3) \cdot 1] = e^{-3s} \mathcal{L}[1] = \frac{e^{-3s}}{s}$$


More generally, we can show that for any a > 0,


$$\mathcal{L}[u(t - a)] = \frac{e^{-as}}{s} \tag{5.4.8}$$


Example 5.4.3 Determine the Laplace transform of $f(t) = u(t - 3)\,t^2$.

Solution. With $f(t) = t^2$, by the second shifting property we have


$$\begin{aligned}
\mathcal{L}[u(t - 3)\,t^2] &= e^{-3s} \mathcal{L}[(t + 3)^2] \\
&= e^{-3s} \mathcal{L}[t^2 + 6t + 9] \\
&= e^{-3s} \left( \frac{2}{s^3} + \frac{6}{s^2} + \frac{9}{s} \right)
\end{aligned}$$
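
The second shifting property can be illustrated for example 5.4.3 by integrating numerically from t = 3 onward, which mirrors the step in the derivation where the unit step function replaces the lower limit 0 by a. This is a sketch under assumptions: the helper simpson, the truncation length span, and the sample values of s are illustrative and not part of the text.

```python
import math

def simpson(g, lower, upper, steps=60000):
    """Composite Simpson's rule for the integral of g over [lower, upper]."""
    if steps % 2:
        steps += 1
    h = (upper - lower) / steps
    total = g(lower) + g(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(lower + i * h)
    return total * h / 3

def shifted_transform(s, a=3.0, span=60.0):
    """Numerical L[u(t - a) t^2]: the step removes [0, a), so integrate from a.

    Assumes the integrand is negligible beyond a + span.
    """
    return simpson(lambda z: z * z * math.exp(-s * z), a, a + span)

def closed_form(s):
    """Result of example 5.4.3: e^{-3s} (2/s^3 + 6/s^2 + 9/s)."""
    return math.exp(-3 * s) * (2 / s**3 + 6 / s**2 + 9 / s)

for s in (1.0, 2.0):
    print(f"s = {s}: numeric = {shifted_transform(s):.10f}, "
          f"formula = {closed_form(s):.10f}")
```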


Example 5.4.4 Determine the Laplace transform of f(t) = u(t − a) − u(t − b).

Solution. Because we know L[u(t − a)] = e^{-as}/s, we can use the linearity of
the Laplace transform to find

\[ \mathcal{L}[u(t-a) - u(t-b)] = \frac{1}{s}e^{-as} - \frac{1}{s}e^{-bs} = \frac{1}{s}\left(e^{-as} - e^{-bs}\right) \]


With our understanding of the Laplace transform of step functions and the 
second shifting property, we are now prepared to compute transforms of a wide 
range of step functions. 


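Example 5.4.5 Find the Laplace transform of

\[
f(t) = \begin{cases} 1, & \text{if } 0 \le t < 1 \\ t, & \text{if } 1 \le t < 2 \\ 2, & \text{if } 2 \le t \end{cases}
\]

Solution. We first use step functions to write f(t) with a single formula. Using
u(t) − u(t − 1) to turn 1 on and off, and similar ideas for t and 2, we have

\[
\begin{aligned}
f(t) &= 1[u(t) - u(t-1)] + t[u(t-1) - u(t-2)] + 2u(t-2) \\
&= u(t) + (t-1)u(t-1) + (2-t)u(t-2)
\end{aligned}
\]

Using the linearity of the Laplace transform, the second shifting property, and
familiar transforms,

\[
\begin{aligned}
\mathcal{L}[f(t)] &= \mathcal{L}[u(t)] + \mathcal{L}[(t-1)u(t-1)] + \mathcal{L}[(2-t)u(t-2)] \\
&= \frac{1}{s} + e^{-s}\mathcal{L}[(t+1)-1] + e^{-2s}\mathcal{L}[2-(t+2)] \\
&= \frac{1}{s} + e^{-s}\mathcal{L}[t] + e^{-2s}\mathcal{L}[-t] \\
&= \frac{1}{s} + \frac{e^{-s}}{s^2} - \frac{e^{-2s}}{s^2}
\end{aligned}
\]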


Example 5.4.6 Find the Laplace transform of f(t), where f(t) is the piecewise 
linear function shown in the following graph. 


Solution. From the graph, we see that f has slope 1 on [0, 2) and slope —2 on 
[2, 3). Therefore, f can be defined piecewise by the rule 


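\[
f(t) = \begin{cases} t, & \text{if } 0 \le t < 2 \\ 6-2t, & \text{if } 2 \le t < 3 \\ 0, & \text{if } 3 \le t \end{cases}
\]

Using step functions, we can write f according to the formula

\[
\begin{aligned}
f(t) &= t[u(t) - u(t-2)] + (6-2t)[u(t-2) - u(t-3)] \\
&= t\,u(t) + (6-3t)u(t-2) - (6-2t)u(t-3)
\end{aligned}
\]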




Applying the second shifting property, linearity, and familiar transforms, we 
see that 


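\[
\begin{aligned}
\mathcal{L}[f(t)] &= \mathcal{L}[t\,u(t)] + \mathcal{L}[(6-3t)u(t-2)] - \mathcal{L}[(6-2t)u(t-3)] \\
&= \mathcal{L}[t] + e^{-2s}\mathcal{L}[6-3(t+2)] - e^{-3s}\mathcal{L}[6-2(t+3)] \\
&= \mathcal{L}[t] + e^{-2s}\mathcal{L}[-3t] - e^{-3s}\mathcal{L}[-2t] \\
&= \frac{1}{s^2} - \frac{3}{s^2}e^{-2s} + \frac{2}{s^2}e^{-3s}
\end{aligned}
\]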
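
The transform found in example 5.4.6 can be checked numerically by integrating f(t)e^{-st} piece by piece. The following is a minimal sketch under that assumption; the function names and the sample values of s are hypothetical choices made only for illustration.

```python
import math

def simpson(g, lower, upper, n=2000):
    """Composite Simpson approximation of the integral of g over [lower, upper] (n even)."""
    h = (upper - lower) / n
    total = g(lower) + g(upper)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(lower + k * h)
    return total * h / 3

def transform_numeric(s):
    # f(t) = t on [0, 2), 6 - 2t on [2, 3), and 0 afterwards,
    # so the transform integral splits into two finite pieces.
    first = simpson(lambda t: t * math.exp(-s * t), 0.0, 2.0)
    second = simpson(lambda t: (6 - 2 * t) * math.exp(-s * t), 2.0, 3.0)
    return first + second

def transform_closed(s):
    # 1/s^2 - (3/s^2) e^{-2s} + (2/s^2) e^{-3s}
    return (1 - 3 * math.exp(-2 * s) + 2 * math.exp(-3 * s)) / s**2

for s in (0.5, 1.0, 2.0):
    print(s, transform_numeric(s), transform_closed(s))
```

Splitting the integral at t = 2 and t = 3 keeps each integrand smooth, which is why a simple quadrature rule suffices here.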


At this point, we have become familiar with piecewise-defined functions 
and how the Laplace transform may be applied to them. In the near future, we 
will be solving initial-value problems of the form 

\[ y' + 2y = 6u(t-4), \quad y(0) = 1 \]

through the use of Laplace transforms. In order to assess our progress to date,
we explore this approach briefly here. Taking the transform of both sides of the
differential equation,

\[ s\mathcal{L}[y] - 1 + 2\mathcal{L}[y] = \frac{6e^{-4s}}{s} \]

Letting Y(s) = L[y] and solving for Y(s), it follows that

\[ Y(s)(s+2) = 1 + \frac{6e^{-4s}}{s} \]

so

\[ Y(s) = \frac{1}{s+2} + 6e^{-4s}\,\frac{1}{s(s+2)} \tag{5.4.9} \]


Here, it remains to determine the function y(t) whose Laplace transform is
Y(s). That is, we must compute the inverse Laplace transform of the right-hand
side of (5.4.9). Doing so involves using the inverse perspective on the second
shifting property, as well as some algebraic work with the quantity 1/(s(s + 2)).
We will pursue these and related ideas further in subsequent sections.

Next, however, we turn our attention to the study of impulse functions that
can model phenomena such as the striking of a hammer.


5.4.2 The Dirac delta function 


In physical situations where a large force is delivered over a very short time 
interval, unit step functions are no longer sufficient to model the forcing 
function. For example, if a hammer is used to strike a mass attached to a spring 
at a given time, it is not immediately clear how we should represent this forcing 
function. To address this situation, physicist Paul Dirac proposed what is today
called the Dirac delta function, denoted δ(t). We seek to understand this function
by first examining what happens when a force of constant magnitude acts over 
a smaller and smaller time interval. 

Suppose that a force F_h of constant magnitude acts on an object over the
time interval [a − h, a + h], where a > 0. Assume that the force is zero otherwise.
The impulse (or amount of push) of the force is defined by

\[ \int_{a-h}^{a+h} F_h \, dt \tag{5.4.10} \]

If we want this constant force F_h to deliver a one-unit impulse, it follows that

\[ F_h = \frac{1}{2h} \]

More specifically, if we wish to view the delivered force F_h as being generated
by a forcing function F_h(t), we can use the unit step function to express F_h(t)
through the formula

\[ F_h(t) = \frac{1}{2h}\left[u(t-(a-h)) - u(t-(a+h))\right] \tag{5.4.11} \]

A plot of F_h(t) for several different values of h is shown in figure 5.5; the
vertical lines in each are technically not a part of the graph of F_h(t), but are
included to help contrast the different values of h. Note particularly that F_h(t)
satisfies the property that

\[ \int_{-\infty}^{\infty} F_h(t)\,dt = 1 \tag{5.4.12} \]

and that as h → 0, the magnitude of the force grows without bound in order to
maintain the same total amount of push being delivered.

Figure 5.5 The forcing function F_h(t)
for h = 0.2, h = 0.1, and h = 0.05.

For an actual impulse, such as when a hammer strikes a mass, we want
the force to act instantaneously at time t = a, where a > 0. This instantaneous
impulse function is known as the Dirac delta function, denoted δ(t − a), and is
determined by letting h → 0 in F_h(t). In particular, we note two key properties
of δ(t − a):

I. \( \delta(t-a) = \lim_{h\to 0} F_h(t) = \lim_{h\to 0} \frac{1}{2h}\left[u(t-(a-h)) - u(t-(a+h))\right] \)

II. \( \int_{-\infty}^{\infty} \delta(t-a)\,dt = 1 \)

Property I is the definition of the Dirac delta function; Property II is a
consequence of (5.4.12) and taking the limit as h → 0.

A good way to think of δ(t − a) is as a function that is zero everywhere
except at a, but infinite right at a. Actually, δ(t − a) is a limit of step functions
that are nonzero over shorter and shorter intervals, but that always enclose an
area of one unit, thus having spikes that grow in magnitude as the interval
width shrinks. In situations such as a mass being struck with a hammer, we
can now use the delta function to model the forcing function. For instance,
if a hammer strikes the mass at t = 3, we can model the forcing function
by f(t) = δ(t − 3).

In order to solve initial-value problems that involve the delta function, it
will be essential to know the Laplace transform L[δ(t − a)]. To find it, we first
apply the definition of the transform to the step function F_h(t). In particular,
by familiar properties of the Laplace transform,

\[
\begin{aligned}
\mathcal{L}[F_h(t)] &= \mathcal{L}\left[\frac{1}{2h}\left[u(t-(a-h)) - u(t-(a+h))\right]\right] \\
&= \frac{1}{2h}\mathcal{L}[u(t-(a-h))] - \frac{1}{2h}\mathcal{L}[u(t-(a+h))] \\
&= \frac{1}{2h}\left(\frac{1}{s}e^{-(a-h)s} - \frac{1}{s}e^{-(a+h)s}\right) \\
&= \frac{e^{-as}}{2hs}\left(e^{hs} - e^{-hs}\right)
\end{aligned}
\tag{5.4.13}
\]


4 Technically, the Dirac delta function is not a function, because it has the unusual property that it is
zero everywhere but t = a, and infinite at t = a. Ultimately, the Laplace transform is what enables us to
make sense of this function.

Since δ(t − a) is defined as the limit of F_h(t) as h → 0, we naturally define
the Laplace transform of δ(t − a) to be the limit of the Laplace transform of
F_h(t) as h → 0. In particular, from (5.4.13), some algebraic rearrangement, and
an application of L'Hopital's Rule, we can state that

\[
\begin{aligned}
\lim_{h\to 0} \mathcal{L}[F_h(t)] &= \lim_{h\to 0} \frac{e^{-as}}{2hs}\left(e^{hs} - e^{-hs}\right) \\
&= \frac{e^{-as}}{s} \lim_{h\to 0} \frac{e^{hs} - e^{-hs}}{2h} \\
&= \frac{e^{-as}}{s} \lim_{h\to 0} \frac{se^{hs} + se^{-hs}}{2} \\
&= \frac{e^{-as}}{s}\cdot s = e^{-as}
\end{aligned}
\]

We therefore define L[δ(t − a)] = e^{-as}. We close this section with an
example that foreshadows the use of the delta function in a spring-mass system
and the role of Laplace transforms in solving the corresponding IVP.
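
The limit that defines the transform of the delta function can also be observed numerically. The sketch below simply evaluates the right-hand side of (5.4.13) for shrinking h; the function name `transform_of_pulse` and the sample values a = 3 and s = 2 are assumptions chosen only for illustration.

```python
import math

def transform_of_pulse(a, h, s):
    """L[F_h(t)] = e^{-as} (e^{hs} - e^{-hs}) / (2 h s), as in (5.4.13)."""
    return math.exp(-a * s) * (math.exp(h * s) - math.exp(-h * s)) / (2 * h * s)

a, s = 3.0, 2.0
limit = math.exp(-a * s)  # the claimed transform of delta(t - a)
for h in (0.2, 0.1, 0.05, 0.01, 0.001):
    # The second column should approach the third as h shrinks.
    print(h, transform_of_pulse(a, h, s), limit)
```

Because (e^{hs} − e^{-hs})/(2hs) equals sinh(hs)/(hs), the printed values decrease toward e^{-as} from above as h decreases.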


Example 5.4.7 Consider a spring-mass system where m = 1, k = 13, and c = 4.
Assume that the mass is initially displaced 1 m and released. Finally, assume that 
at t = 3, the mass is struck with a hammer in the positive direction. Set up and 
solve an initial-value problem that describes this situation. 


Solution. Using the delta function, the given problem is a standard damped
harmonic oscillator equation with an impulse forcing function. In particular,
the displacement y of the mass satisfies the initial-value problem

\[ y'' + 4y' + 13y = \delta(t-3), \quad y(0) = 1, \quad y'(0) = 0 \tag{5.4.14} \]


Before we solve the IVP, we can use our intuition as a guide: we expect the size of 
the oscillations of the mass to decrease in magnitude until t = 3, at which time 
we expect the problem to restart as the blow from the hammer will increase the 
displacement of the mass, from which oscillations should eventually decrease to 
zero. We begin to solve (5.4.14) by using the Laplace transform in order to see 
how far our method enables us to progress. 


Taking the Laplace transform of both sides of (5.4.14),

\[ \mathcal{L}[y''] + 4\mathcal{L}[y'] + 13\mathcal{L}[y] = \mathcal{L}[\delta(t-3)] \]

From corollary 5.3.5, it follows that

\[ s^2\mathcal{L}[y] - sy(0) - y'(0) + 4s\mathcal{L}[y] - 4y(0) + 13\mathcal{L}[y] = \mathcal{L}[\delta(t-3)] \]

Using the conditions y(0) = 1 and y'(0) = 0, as well as the fact that
L[δ(t − 3)] = e^{-3s}, we now have

\[ s^2\mathcal{L}[y] - s + 4s\mathcal{L}[y] - 4 + 13\mathcal{L}[y] = e^{-3s} \]

Solving for L[y] = Y(s), we see that

\[ Y(s)(s^2 + 4s + 13) = s + 4 + e^{-3s} \]

or

\[ Y(s) = \frac{s+4}{s^2+4s+13} + \frac{e^{-3s}}{s^2+4s+13} \tag{5.4.15} \]

It remains for us to learn how to compute the inverse Laplace transform
of (5.4.15) in order to find the solution y to the IVP. The following sections
are devoted to these ideas. Upon further study, we will be able to show that the
function y(t) whose transform satisfies (5.4.15) is

\[ y(t) = \frac{1}{3}e^{-2t}(3\cos 3t + 2\sin 3t) + \frac{1}{3}u(t-3)e^{-2(t-3)}\sin 3(t-3) \]

Figure 5.6 The solution to the IVP
(5.4.14).

A plot of this solution is shown in figure 5.6, where y(t) demonstrates precisely 
the type of behavior we expect. 
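
One way to build confidence in the stated solution of example 5.4.7 before inverse transforms are developed is to test it directly. The sketch below assumes the solution formula above, uses central differences to check that y'' + 4y' + 13y is approximately zero away from t = 3, and estimates the jump in y' at t = 3, which should equal the unit impulse. The function names and step sizes are illustrative assumptions.

```python
import math

def y(t):
    """Claimed solution of y'' + 4y' + 13y = delta(t - 3), y(0) = 1, y'(0) = 0."""
    free = math.exp(-2 * t) * (3 * math.cos(3 * t) + 2 * math.sin(3 * t)) / 3
    if t < 3:
        return free
    return free + math.exp(-2 * (t - 3)) * math.sin(3 * (t - 3)) / 3

def residual(t, h=1e-4):
    """Approximate y'' + 4y' + 13y at t (valid only away from t = 3)."""
    d1 = (y(t + h) - y(t - h)) / (2 * h)
    d2 = (y(t + h) - 2 * y(t) + y(t - h)) / h**2
    return d2 + 4 * d1 + 13 * y(t)

def slope_jump(h=1e-6):
    """One-sided difference estimate of y'(3+) - y'(3-)."""
    left = (y(3 - h) - y(3 - 2 * h)) / h
    right = (y(3 + 2 * h) - y(3 + h)) / h
    return right - left

print(residual(1.0), residual(4.0))  # both should be close to 0
print(slope_jump())                  # should be close to 1
```

The jump of one unit in the velocity at t = 3, with the displacement itself remaining continuous, is the behavior expected when a unit impulse strikes a mass with m = 1.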


The Laplace transform helps us make sense of the Dirac delta function in several
ways. One is that we can imagine wanting to say that a hammer strikes a mass
with different intensities. If, say, we want to compare the results of the
initial-value problems where a hammer strikes a mass to deliver a given impulse versus
what happens when the hammer strikes the mass three times as hard, this at
first seems to be nonsense: δ(t − 3) and 3δ(t − 3) are both zero everywhere
and infinite at t = 3. But the power of the Laplace transform rescues us again.
Since by linearity, L[3δ(t − 3)] = 3L[δ(t − 3)] = 3e^{-3s}, the transform detects
the difference in the amount of push delivered by the hammer strike, and the
results are shown accordingly in the solution to the initial-value problem. In
addition, since L[δ(t − a)] = e^{-as}, we know that the presence of e^{-as} in Y(s)
will lead to the presence of u(t − a) in y(t): here we see how the delta function
leads to a restart at t = a as the function u(t − a) turns on at this time in the
function y(t).


5.4.3 The Heaviside and Dirac functions in Maple


Both the Heaviside and Dirac functions belong to Maple's library of basic
functions. The syntax for the Heaviside function is simply > Heaviside(t);.
Similarly, the Dirac function is given by > Dirac(t);.

For work with the Heaviside function, we often denote the function by u(t). 
In Maple, this can be accomplished with the command 


> u := t -> Heaviside(t); 


Then, to enter and plot a piecewise-defined function such as 
f(t) = t(u(t) — u(t — 2)) + (6 — 2t)(u(t — 2) — u(t — 3)) 
we may use the syntax 
> f := t -> t*(u(t)-u(t-2)) + (6-2*t)*(u(t-2)-u(t-3));
> plot(f(t), t=-1..5, color=black, thickness=2);
to generate the plot shown in figure 5.7. 


More on both the Heaviside function and the Dirac function in Maple, 
particularly related to their roles in solving initial-value problems with Laplace 
transforms, can be found in section 5.6.1. 


Figure 5.7 The function f(t) =
t(u(t) − u(t − 2)) + (6 − 2t)(u(t − 2) − u(t − 3)).


Exercises 5.4 In exercises 1-7, sketch a graph of each of the following 
functions and write each in terms of unit step functions. 


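1. \( f(t) = \begin{cases} 0, & \text{if } 0 \le t < 1 \\ 1, & \text{if } 1 \le t < 2 \\ 0, & \text{if } 2 \le t \end{cases} \)

2. \( f(t) = \begin{cases} 1, & \text{if } 0 \le t < 4 \\ \text{[illegible]}, & \text{if } 4 \le t \end{cases} \)

3. \( f(t) = \begin{cases} 0, & \text{if } 0 \le t < 1 \\ t, & \text{if } 1 \le t < 2 \\ t^2, & \text{if } 2 \le t \end{cases} \)

4. \( f(t) = \begin{cases} t, & \text{if } 0 \le t < 2 \\ \text{[illegible]}, & \text{if } 2 \le t \end{cases} \)

5. \( f(t) = \begin{cases} \sin t, & \text{if } 0 \le t < 2\pi \\ \text{[illegible]}, & \text{if } 2\pi \le t \end{cases} \)

6. \( f(t) = \begin{cases} \sin t, & \text{if } 0 \le t < 2\pi \\ \text{[illegible]}, & \text{if } 2\pi \le t \end{cases} \)

7. \( f(t) = \begin{cases} t, & \text{if } 0 \le t < 2 \\ 2, & \text{if } 2 \le t < 4 \\ 4-t, & \text{if } 4 \le t \end{cases} \)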


8. Determine the Laplace transform of the function f(t) given in

(a) Exercise 1
(b) Exercise 2
(c) Exercise 3
(d) Exercise 4
(e) Exercise 5
(f) Exercise 6
(g) Exercise 7


In exercises 9-11, compute the Laplace transform of f(t).

9. f(t) = 2[u(t − 1) − u(t − 3)] + δ(t − 5)

10. f(t) = 2 sin 5t + δ(t − 3)

11. f(t) = 2e^{-t} sin 2t + δ(t − 8)


12. Set up, but do not solve, an initial-value problem that represents a 
spring-mass system with m = 4 kg, spring constant k = 10, and damping 
constant c = 2, where a unit impulse is delivered by a hammer at t = 6. 
Assume the units on all quantities are consistent and that the mass is 
initially displaced 0.25 m and released. 


13. Set up, but do not solve, an initial-value problem that represents a 
spring-mass system with m = 4 kg, spring constant k = 10, and damping 
constant c = 2, where a forcing function f(t) = 3sin2t is turned on at 
t = 4 and an impulse of magnitude 4 is delivered by a hammer at t = 10.
Assume the units on all quantities are consistent and that the mass is
initially displaced 0.25 m and released.


5.5 Solving IVPs with the Laplace transform 


As we have seen in examples 5.3.5 and 5.4.7, in order to solve initial-value 
problems using the Laplace transform, the final step in the process is to answer 
the question “what function y(t) has Laplace transform Y(s)?” In this section, we 
will further study the inverse Laplace transform, the process that takes the Laplace 
transform of an unknown function back to the function itself. Throughout, we 
motivate our work through examples of solving initial-value problems to see 
some of the typical functions Y(s) that arise in this approach and the steps 
necessary to determine y(t) = L^{-1}[Y(s)].


Example 5.5.1 Use Laplace transforms to solve the initial-value problem

\[ y' - 2y = 5, \quad y(0) = 4 \]

Solution. We begin by taking the Laplace transform of both sides of the
differential equation. Using the linearity of the transform,

\[ \mathcal{L}[y'] - 2\mathcal{L}[y] = 5\mathcal{L}[1] \]

By theorem 5.3.4 and the familiar transform of the function f(t) = 1, it
follows that

\[ s\mathcal{L}[y] - y(0) - 2\mathcal{L}[y] = \frac{5}{s} \]

Using the given fact that y(0) = 4 and denoting L[y] = Y(s),

\[ sY(s) - 2Y(s) = 4 + \frac{5}{s} \tag{5.5.1} \]

Note particularly that (5.5.1) is now an algebraic equation in the unknown
function Y(s). Solving for Y(s), we find

\[ Y(s) = \frac{4s+5}{s(s-2)} \]

At this point, we recall that Y(s) = L[y], where y(t) is the original unknown
function we seek as the solution to the stated IVP. Solving the IVP has now been
reduced to finding the function y(t) that has Laplace transform Y(s). That is,
we seek y(t) = L^{-1}[Y(s)].

With a bit of algebraic rearrangement and insight, we can find the function
y(t). In particular, using a partial fraction decomposition, we can show that

\[ Y(s) = \frac{4s+5}{s(s-2)} = -\frac{5/2}{s} + \frac{13/2}{s-2} \tag{5.5.2} \]

Recalling that L[1] = 1/s and L[e^{2t}] = 1/(s − 2), (5.5.2) implies

\[ y(t) = -\frac{5}{2} + \frac{13}{2}e^{2t} \]

This is precisely the solution we would find to the IVP were we to use an 
integrating factor or separation of variables to solve the differential equation. 


Whenever we use the Laplace transform to solve an IVP, we will employ a process 
similar to our work in example 5.5.1: 


(1) Take the transform of both sides of the stated differential equation to 
transform the differential equation in y(t) into an algebraic equation in 
Y(s) = L[y];


(2) Use algebra to solve for Y(s); 
(3) Determine which function y(t) has the Laplace transform Y(s). 
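
The three steps above can be cross-checked on example 5.5.1 without any symbolic machinery. In the sketch below, which is only an illustration, the partial fraction decomposition (5.5.2) is compared with Y(s) at a sample point, and the resulting y(t) is compared with a direct numerical solution of y' = 2y + 5 by the classical fourth-order Runge-Kutta method. The function names and the step count are assumptions.

```python
import math

def Y(s):
    """Y(s) = (4s + 5) / (s (s - 2)) from step (2)."""
    return (4 * s + 5) / (s * (s - 2))

def Y_partial(s):
    """Partial fraction form (5.5.2): -(5/2)/s + (13/2)/(s - 2)."""
    return -2.5 / s + 6.5 / (s - 2)

def y_exact(t):
    """Inverse transform from step (3): y(t) = -5/2 + (13/2) e^{2t}."""
    return -2.5 + 6.5 * math.exp(2 * t)

def rk4(f, y0, t_end, n=2000):
    """Classical Runge-Kutta solution of y' = f(t, y), y(0) = y0, evaluated at t_end."""
    h = t_end / n
    t, y = 0.0, y0
    for _ in range(n):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h * k1 / 2)
        k3 = f(t + h / 2, y + h * k2 / 2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y

print(Y(3.7), Y_partial(3.7))
print(rk4(lambda t, y: 2 * y + 5, 4.0, 1.0), y_exact(1.0))
```

Agreement in both printed pairs suggests that the algebra in step (2) and the inversion in step (3) are consistent with the original IVP.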


As we have noted previously, given a function F(s), a function f(t) such that
L[f(t)] = F(s) is called the inverse Laplace transform of F. We use the notation
L^{-1}[F(s)] = f(t). For our purposes, a good way to view the operator L^{-1} is as
one that reverses the work of the Laplace transform.

A key step in working backward will be to decompose the function F(s) 
into more manageable pieces, often through a partial fraction decomposition. 
A review of partial fractions can be found in appendix A; partial fractions are 
an algebraic technique that proves useful for more than just integration, as we 
will see throughout this section. Once the pieces of F(s) are in a recognizable 
form, we use standard rules we have developed for Laplace transforms to 
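compute the inverse transform. For example, after using partial fractions to
decompose Y(s) in example 5.5.1, we showed that since L[e^{2t}] = 1/(s − 2), it
follows that

\[ \mathcal{L}^{-1}\left[\frac{1}{s-2}\right] = e^{2t} \]

More generally, we can state that

\[ \mathcal{L}^{-1}\left[\frac{1}{s-a}\right] = e^{at} \tag{5.5.3} \]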


Indeed, we realize that we can turn around any known relationship generated by 
the Laplace transform in order to make a statement about the inverse transform. 
For example, the inverse transform satisfies the linearity property stated in the 
following theorem. 


Theorem 5.5.1 For every pair of constants a and b,

\[ \mathcal{L}^{-1}[aF(s) + bG(s)] = a\mathcal{L}^{-1}[F(s)] + b\mathcal{L}^{-1}[G(s)] \]

Both shifting properties we have developed are regularly used in their inverse
form. For the first shifting property, given L[f(t)] = F(s), we know that for any
real value of a,

\[ \mathcal{L}[e^{at}f(t)] = F(s-a) \]

Stated differently, this first shifting property implies

\[ \mathcal{L}^{-1}[F(s-a)] = e^{at}f(t) \tag{5.5.4} \]

Likewise, from the slightly revised version of the second shifting property, we
know that

\[ \mathcal{L}[u(t-a)f(t-a)] = e^{-as}\mathcal{L}[f(t)] = e^{-as}F(s) \]

and therefore stated in inverse form,

\[ \mathcal{L}^{-1}[e^{-as}F(s)] = u(t-a)f(t-a) \tag{5.5.5} \]
In our next example, we see how several of these fundamental concepts are 
employed in practice, specifically when step functions are involved. 
Example 5.5.2 Use Laplace transforms to solve the initial-value problem

\[ y' + y = 5u(t-1), \quad y(0) = 4 \]

Solution. Taking the Laplace transform of both sides of the differential
equation and applying the initial condition,

\[ s\mathcal{L}[y] - 4 + \mathcal{L}[y] = 5\mathcal{L}[u(t-1)] \]

Using the established fact that L[u(t − 1)] = e^{-s}/s and letting Y(s) = L[y],

\[ sY(s) - 4 + Y(s) = \frac{5e^{-s}}{s} \]

Solving for Y(s),

\[ Y(s) = \frac{4}{s+1} + 5e^{-s}\,\frac{1}{s(s+1)} \tag{5.5.6} \]
At this point, we need to use the inverse transform to solve for y(t). Finding
L^{-1}[4/(s + 1)] is straightforward: by linearity and the first shifting property,

\[ \mathcal{L}^{-1}\left[\frac{4}{s+1}\right] = 4\mathcal{L}^{-1}\left[\frac{1}{s+1}\right] = 4e^{-t} \tag{5.5.7} \]

To deal with the remaining term in (5.5.6), we note that with e^{-s} present we
will need to use the second shifting property (5.5.5) in reverse. For this, it will
be most useful to have the function

\[ \frac{1}{s(s+1)} \]

in a simpler form. Using its partial fraction decomposition, we observe that

\[ \frac{1}{s(s+1)} = \frac{1}{s} - \frac{1}{s+1} \]

By (5.5.5), it now follows that

\[ \mathcal{L}^{-1}\left[e^{-s}\left(\frac{1}{s} - \frac{1}{s+1}\right)\right] = u(t-1)\left(1 - e^{-(t-1)}\right) \tag{5.5.8} \]

Combining our work at (5.5.7) and (5.5.8) to determine y(t) from (5.5.6), we
have shown that

\[ y(t) = 4e^{-t} + 5u(t-1) - 5u(t-1)e^{-(t-1)} \]

5 We know L^{-1}[1/s] = 1, and thus the first shifting property implies L^{-1}[1/(s + 1)] = e^{-t} · 1.

Figure 5.8 The solution to the IVP
of example 5.5.2.


A plot of this solution curve is shown in figure 5.8, where we see qualitative 
behavior consistent with what we would expect from the forcing function in the 
IVP. In particular, the forcing function is 5u(t — 1), which makes the forcing 
function behave as if the constant function 5 is turned on at t = 1 in the initial- 
value problem. For t = 0 to t = 1, we see the standard exponential decay that 
we would expect for the homogeneous equation y’ + y = 0. But at t = 1, the 
solution function turns and begins to approach the equilibrium solution y = 5 
that we expect in the nonhomogeneous equation y’ + y = 5. We note specifically 
that the Laplace transform has successfully handled all of this at once, including 
the role of the initial condition y(0) = 4 and the corner in the solution function 
y(t) at t = 1.
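
The qualitative description above suggests a direct check of example 5.5.2: solve y' = −y on [0, 1], then restart from the computed value with y' = −y + 5 on [1, 3], and compare with the formula obtained from the Laplace transform. The sketch below does this with a fourth-order Runge-Kutta step; the helper name `rk4_constant_forcing` and the step counts are assumptions made for illustration.

```python
import math

def y_exact(t):
    """Solution from example 5.5.2: 4e^{-t} + 5u(t-1)(1 - e^{-(t-1)})."""
    step = 1.0 if t >= 1 else 0.0
    return 4 * math.exp(-t) + 5 * step * (1 - math.exp(-(t - 1)))

def rk4_constant_forcing(y0, forcing, duration, n=2000):
    """Integrate y' = -y + forcing over a time span of length `duration`."""
    def rhs(v):
        return -v + forcing
    h = duration / n
    y = y0
    for _ in range(n):
        k1 = rhs(y)
        k2 = rhs(y + h * k1 / 2)
        k3 = rhs(y + h * k2 / 2)
        k4 = rhs(y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return y

# Before t = 1 the step function is off; afterwards the constant 5 is switched on.
y_at_1 = rk4_constant_forcing(4.0, 0.0, 1.0)
y_at_3 = rk4_constant_forcing(y_at_1, 5.0, 2.0)
print(y_at_1, y_exact(1.0))
print(y_at_3, y_exact(3.0))
```

Treating the two time intervals separately mirrors what the unit step function encodes in a single formula, and avoids integrating numerically across the discontinuity in the forcing function.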


We next solve a second-order initial-value problem that involves the unit step 
function. Here, we will see how the higher order of the equation introduces 
additional complexity in determining the inverse Laplace transform needed to 
solve the IVP.


Example 5.5.3 Use the Laplace transform to solve the initial-value problem

\[ y'' + 2y' + 5y = u(t-2), \quad y(0) = 1, \quad y'(0) = 0 \tag{5.5.9} \]

Solution. Taking the Laplace transform of both sides of (5.5.9) and writing
Y(s) = L[y(t)], we observe that

\[ s^2Y(s) - sy(0) - y'(0) + 2(sY(s) - y(0)) + 5Y(s) = \frac{e^{-2s}}{s} \]

Substituting the given initial conditions and factoring on the left, we have

\[ Y(s)(s^2 + 2s + 5) = s + 2 + \frac{e^{-2s}}{s} \]

Solving for Y(s), we can write

\[ Y(s) = Y_1(s) + Y_2(s) = \frac{s+2}{s^2+2s+5} + \frac{e^{-2s}}{s(s^2+2s+5)} \tag{5.5.10} \]

It remains for us to determine the function y(t) whose transform is Y(s). By
linearity, it helps for us to break the function Y(s) into the simplest pieces
we can; we begin by determining the inverse transform of Y_1(s). Because of
shifting properties of the transform (and because of the fact that we cannot
factor s^2 + 2s + 5 in an effort to apply partial fractions), it is useful to complete
the square in expressions such as s^2 + 2s + 5. We instead write (s + 1)^2 + 4, and
seek to identify other parts of the expression that involve (s + 1). Separating the
numerator (s + 2) into (s + 1) + 1, we can express the first term in (5.5.10) as

\[ Y_1(s) = \frac{s+2}{(s+1)^2+4} = \frac{s+1}{(s+1)^2+4} + \frac{1}{(s+1)^2+4} \tag{5.5.11} \]

Recalling that L[cos 2t] = s/(s^2 + 4) and L[sin 2t] = 2/(s^2 + 4), we know

\[ \mathcal{L}^{-1}\left[\frac{s}{s^2+4}\right] = \cos 2t \quad\text{and}\quad \mathcal{L}^{-1}\left[\frac{2}{s^2+4}\right] = \sin 2t \]

The inverse of the first shifting property, L^{-1}[F(s + 1)] = e^{-t}f(t), now implies
that

\[ \mathcal{L}^{-1}\left[\frac{s+1}{(s+1)^2+4} + \frac{1}{(s+1)^2+4}\right] = e^{-t}\cos 2t + \frac{1}{2}e^{-t}\sin 2t \tag{5.5.12} \]

Hence, the first term Y_1(s) in (5.5.10) comes from taking the Laplace transform
of the function y_1(t) = e^{-t} cos 2t + (1/2)e^{-t} sin 2t.
From (5.5.10), it remains for us to find the function y_2(t) whose Laplace
transform is

\[ Y_2(s) = e^{-2s}\,\frac{1}{s(s^2+2s+5)} \]

Using a partial fraction decomposition on the rational part of the function,
we have

\[ e^{-2s}\,\frac{1}{s(s^2+2s+5)} = \frac{1}{5}e^{-2s}\left(\frac{1}{s} - \frac{s+2}{s^2+2s+5}\right) \]

Observe that we have already determined the inverse transform of the
function (s + 2)/(s^2 + 2s + 5) above at (5.5.12). Here, we must deal with
the additional presence of the constant 1/5, the multiplier e^{-2s}, and the basic
function 1/s. Recalling the inverse second shifting property, L^{-1}[e^{-as}F(s)] =
u(t − a)f(t − a), and (5.5.12), we observe that

\[
\mathcal{L}^{-1}\left[e^{-2s}\left(\frac{1}{s} - \frac{s+2}{s^2+2s+5}\right)\right]
= u(t-2)\left[1 - e^{-(t-2)}\cos 2(t-2) - \frac{1}{2}e^{-(t-2)}\sin 2(t-2)\right]
\tag{5.5.13}
\]

Combining (5.5.10), (5.5.12), and (5.5.13), we have shown that the solution
y(t) to the initial-value problem is

\[
y(t) = e^{-t}\cos 2t + \frac{e^{-t}}{2}\sin 2t + \frac{1}{5}u(t-2)
- \frac{1}{5}u(t-2)\left[e^{-(t-2)}\cos 2(t-2) + \frac{e^{-(t-2)}}{2}\sin 2(t-2)\right]
\]

Figure 5.9 The solution y(t) to the IVP
in example 5.5.3.


A plot of the function y(t) is shown in figure 5.9. Here, we see evidence of 
the qualitative behavior we expect: until the unit step function turns on, the 
homogeneous equation should show damped oscillations so that y(t) > 0. 
But once the step function turns on, the forcing function makes the equation 
nonhomogeneous with a constant forcing function, making y = 1/5 the stable 
equilibrium solution to which y(t) tends. 
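
The solution of example 5.5.3 can be tested against the differential equation itself. The following sketch assumes the formula for y(t) derived above and uses central differences to estimate y'' + 2y' + 5y − u(t − 2) on either side of t = 2; it also checks the initial value and the long-run equilibrium y = 1/5. The function names and the difference step are illustrative assumptions.

```python
import math

def y(t):
    """Solution of y'' + 2y' + 5y = u(t - 2), y(0) = 1, y'(0) = 0 from example 5.5.3."""
    base = math.exp(-t) * (math.cos(2 * t) + 0.5 * math.sin(2 * t))
    if t < 2:
        return base
    tau = t - 2
    shifted = math.exp(-tau) * (math.cos(2 * tau) + 0.5 * math.sin(2 * tau))
    return base + 0.2 * (1 - shifted)

def residual(t, h=1e-4):
    """Approximate y'' + 2y' + 5y - u(t - 2) at t (valid away from t = 2)."""
    d1 = (y(t + h) - y(t - h)) / (2 * h)
    d2 = (y(t + h) - 2 * y(t) + y(t - h)) / h**2
    forcing = 1.0 if t >= 2 else 0.0
    return d2 + 2 * d1 + 5 * y(t) - forcing

print(y(0.0))                        # initial displacement
print(residual(1.0), residual(5.0))  # both should be close to 0
print(y(40.0))                       # should be close to the equilibrium 1/5
```

A residual near zero both before and after the step function turns on indicates that the two pieces of the formula fit together as the second shifting property predicts.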


To further explore the ideas that arise in computing inverse transforms, we next 
consider a slight modification of the preceding example, but in an applied setting 
where a more complicated forcing function is present. In particular, we examine 
a spring-mass system in which a periodic forcing function is introduced at t = π.


Example 5.5.4 Consider a mass of 1 kg attached to a spring with spring
constant k = 13 such that the system has damping constant c = 4. Assume that
the mass is displaced 1 m from equilibrium and released at t = 0; furthermore,
at time t = π the forcing function f(t) = 2 sin 3t is applied. Assuming consistent
units, set up an IVP that models this situation and solve the IVP using Laplace 
transforms. 


Solution. From our work with spring-mass systems, we know that the
displacement y(t) of the mass from equilibrium must satisfy the initial-value
problem

\[ y'' + 4y' + 13y = 2u(t-\pi)\sin 3t, \quad y(0) = 1, \quad y'(0) = 0 \]

Taking Laplace transforms, it follows that

\[ s^2Y(s) - sy(0) - y'(0) + 4(sY(s) - y(0)) + 13Y(s) = 2\mathcal{L}[u(t-\pi)\sin 3t] \tag{5.5.14} \]

We know that L[sin 3t] = 3/(s^2 + 9), and by the second shifting property

\[ \mathcal{L}[u(t-\pi)\sin 3t] = e^{-\pi s}\mathcal{L}[\sin 3(t+\pi)] \tag{5.5.15} \]

At this point, we observe by basic trigonometry that sin(3t + 3π) =
sin 3t cos 3π + cos 3t sin 3π = −sin 3t. Hence, from (5.5.15) we have

\[ \mathcal{L}[u(t-\pi)\sin 3t] = e^{-\pi s}\mathcal{L}[-\sin 3t] = -e^{-\pi s}\,\frac{3}{s^2+9} \]

Returning to (5.5.14) and using the given initial conditions, it follows that

\[ s^2Y(s) - s + 4sY(s) - 4 + 13Y(s) = -2e^{-\pi s}\,\frac{3}{s^2+9} \]

Factoring,

\[ Y(s)(s^2 + 4s + 13) = s + 4 - 2e^{-\pi s}\,\frac{3}{s^2+9} \]

Solving for Y(s),

\[ Y(s) = Y_1(s) + Y_2(s) = \frac{s+4}{s^2+4s+13} - 6e^{-\pi s}\,\frac{1}{(s^2+9)(s^2+4s+13)} \tag{5.5.16} \]

It remains to find the inverse transform of Y(s); we do so one piece at a
time using the linearity of the inverse transform. In both Y_1(s) and Y_2(s),
we will algebraically rearrange the expression in order to help us more easily
determine the inverse Laplace transform, using an approach similar to our work
in example 5.5.3.

Taking the first term in (5.5.16), we observe that since the denominator
does not factor, we need to write it in a more familiar form. Completing the
square and separating the numerator enables us to write

\[ Y_1(s) = \frac{s+4}{(s+2)^2+9} = \frac{s+2}{(s+2)^2+9} + \frac{2}{(s+2)^2+9} \]

and see the structure of Laplace transforms of basic functions. In particular,
from the first shifting property and the known Laplace transforms of cos 3t and
sin 3t, it follows that

\[ \mathcal{L}^{-1}[Y_1(s)] = \mathcal{L}^{-1}\left[\frac{s+2}{(s+2)^2+9} + \frac{2}{(s+2)^2+9}\right] = e^{-2t}\cos 3t + \frac{2}{3}e^{-2t}\sin 3t \tag{5.5.17} \]


Next we find the inverse transform of the term Y_2(s) in (5.5.16). That is, we
must determine

\[ \mathcal{L}^{-1}[Y_2(s)] = \mathcal{L}^{-1}\left[-6e^{-\pi s}\,\frac{1}{(s^2+9)(s^2+4s+13)}\right] \tag{5.5.18} \]

From the presence of e^{-πs}, we know the second shifting property will be
used; in addition, we must algebraically rearrange the remaining part of the
expression in order to find the inverse transform. Computing the partial fraction
decomposition of the rational function in (5.5.18), we equivalently seek

\[ \mathcal{L}^{-1}[Y_2(s)] = \mathcal{L}^{-1}\left[\frac{6}{40}e^{-\pi s}\left(\frac{s-1}{s^2+9} - \frac{s+3}{s^2+4s+13}\right)\right] \tag{5.5.19} \]

One additional rearrangement will enable us to find the desired inverse
transform. Completing the square in the second fraction and separating the
numerator in each enables us to rewrite (5.5.19) as

\[ \mathcal{L}^{-1}[Y_2(s)] = \mathcal{L}^{-1}\left[\frac{6}{40}e^{-\pi s}\left(\frac{s}{s^2+9} - \frac{1}{s^2+9} - \frac{s+2}{(s+2)^2+9} - \frac{1}{(s+2)^2+9}\right)\right] \]

Applying the inverse of the second shifting property to each of the terms in
L^{-1}[Y_2(s)], it follows that

\[
\mathcal{L}^{-1}[Y_2(s)] = \frac{6}{40}u(t-\pi)\left[\cos 3(t-\pi) - \frac{1}{3}\sin 3(t-\pi)
- e^{-2(t-\pi)}\cos 3(t-\pi) - \frac{1}{3}e^{-2(t-\pi)}\sin 3(t-\pi)\right]
\tag{5.5.20}
\]

Noting that sin(3t − 3π) = −sin 3t and cos(3t − 3π) = −cos 3t, we can
simplify (5.5.20) to

\[ \mathcal{L}^{-1}[Y_2(s)] = \frac{3}{20}u(t-\pi)\left[-\cos 3t + \frac{1}{3}\sin 3t - e^{-2(t-\pi)}\left(-\cos 3t - \frac{1}{3}\sin 3t\right)\right] \]

Combining our work with L^{-1}[Y_1(s)] and L^{-1}[Y_2(s)], we have therefore
shown that y(t) = L^{-1}[Y(s)] is the function

\[
y(t) = e^{-2t}\left(\cos 3t + \frac{2}{3}\sin 3t\right) + \frac{3}{20}u(t-\pi)\left[-\cos 3t
+ \frac{1}{3}\sin 3t - e^{-2(t-\pi)}\left(-\cos 3t - \frac{1}{3}\sin 3t\right)\right]
\]




Figure 5.10 The solution to the IVP in 
example 5.5.4. 


A plot of the function y(t) is given in figure 5.10, where we see that until
the forcing function activates at t = π, we see the standard damped oscillations
decaying to zero. When the periodic forcing function turns on, the system
demonstrates the repeating oscillations generated by this function.
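
Because example 5.5.4 involves several layers of algebra, a numerical cross-check is reassuring. The sketch below, offered only as an illustration, compares the rational function in (5.5.18) with its partial fraction form in (5.5.19) at a few sample points, and then uses central differences to confirm that the final y(t) approximately satisfies the differential equation on both sides of t = π. All function names and sample points are assumptions.

```python
import math

def rational(s):
    """-6 / ((s^2 + 9)(s^2 + 4s + 13)), the rational part of Y_2(s)."""
    return -6 / ((s**2 + 9) * (s**2 + 4 * s + 13))

def decomposed(s):
    """(6/40) ((s - 1)/(s^2 + 9) - (s + 3)/(s^2 + 4s + 13)), as in (5.5.19)."""
    return (6 / 40) * ((s - 1) / (s**2 + 9) - (s + 3) / (s**2 + 4 * s + 13))

def y(t):
    """Solution of y'' + 4y' + 13y = 2u(t - pi) sin 3t, y(0) = 1, y'(0) = 0."""
    value = math.exp(-2 * t) * (math.cos(3 * t) + (2 / 3) * math.sin(3 * t))
    if t >= math.pi:
        decay = math.exp(-2 * (t - math.pi))
        value += (3 / 20) * (-math.cos(3 * t) + math.sin(3 * t) / 3
                             - decay * (-math.cos(3 * t) - math.sin(3 * t) / 3))
    return value

def residual(t, h=1e-4):
    """Approximate y'' + 4y' + 13y minus the forcing term (valid away from t = pi)."""
    d1 = (y(t + h) - y(t - h)) / (2 * h)
    d2 = (y(t + h) - 2 * y(t) + y(t - h)) / h**2
    forcing = 2 * math.sin(3 * t) if t >= math.pi else 0.0
    return d2 + 4 * d1 + 13 * y(t) - forcing

for s in (0.5, 1.0, 2.5, 7.0):
    print(s, rational(s), decomposed(s))
print(residual(1.0), residual(5.0))  # both should be close to 0
```

For large t the decaying exponentials vanish, leaving (3/20)(−cos 3t + (1/3) sin 3t), which is the steady oscillation visible in the plot.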


At this point in our work, we have been exposed to most of the main ideas 
necessary for using the Laplace transform to solve initial-value problems. In 
addition to knowing the standard properties of the transform and its effects on 
basic functions, we must understand how to compute the inverse transform and 
the algebraic rearrangements that such inversion entails. Specifically, we have 
seen in several examples the need to determine partial fraction decompositions, 
complete the square, and separate the numerator in fractions. For example, the 
key computations necessary to find the inverse transform of the function

\[ F(s) = \frac{11}{s(s^2+6s+11)} \]

are to first determine the partial fraction decomposition and write

\[ F(s) = \frac{1}{s} - \frac{s+6}{s^2+6s+11} \]

The first term is straightforward to invert, but the second term requires further
manipulation. Completing the square in the denominator, we see that s^2 +
6s + 11 = (s + 3)^2 + 2, and therefore it is convenient to write the numerator as
s + 6 = (s + 3) + 3. Doing so,

\[ F(s) = \frac{1}{s} - \frac{s+3}{(s+3)^2+2} - \frac{3}{(s+3)^2+2} \]

It is at this point, together with the first shifting property, that we can finally
compute L^{-1}[F(s)] and find

\[ f(t) = \mathcal{L}^{-1}[F(s)] = 1 - e^{-3t}\cos\sqrt{2}\,t - \frac{3}{\sqrt{2}}e^{-3t}\sin\sqrt{2}\,t \]

Finally, we have also seen that the second shifting property plays an
important role. In the presence of the unit step function u(t − a), the multiplier
e^{-as} will arise in F(s). In that case, we must invert e^{-as}F(s); doing so, we get
u(t − a)f(t − a), as opposed to simply f(t).

In light of these overall comments, we see the need to practice the 
computation of inverse Laplace transforms so that we can use these concepts in 
the solution of initial-value problems. In the next section, we will summarize 
key properties of the inverse transform, consider a few additional examples of 
more complicated inverse transforms, demonstrate the role technology plays in 
computations, and provide exercises for additional practice. 

We close the current section with an example involving the Dirac delta 
function. 


Example 5.5.5 Consider an undamped spring-mass system with spring
constant k = 4. Suppose that the mass is displaced 1 unit from equilibrium
and struck with a force to impart an initial velocity of y'(0) = 1. In addition, at
times t = 7 and t = 20, a hammer delivers a one-unit impulse to the mass in
the positive direction. Assuming consistent units, set up and solve an IVP that
models this situation.


Solution. We use the Dirac delta function to represent the impulse forces
delivered at times t = 7 and t = 20. Coupled with the standard equation to
represent the spring-mass system, we see that the displacement y(t) of the mass
at time t satisfies the initial-value problem

\[ y'' + 4y = \delta(t-7) + \delta(t-20), \quad y(0) = 1, \quad y'(0) = 1 \]

To solve the IVP, we begin by taking Laplace transforms and find that

\[ s^2Y(s) - sy(0) - y'(0) + 4Y(s) = \mathcal{L}[\delta(t-7)] + \mathcal{L}[\delta(t-20)] \]

Recalling that L[δ(t − a)] = e^{-as} and using the given initial conditions, Y(s)
must satisfy the equation

\[ s^2Y(s) - s - 1 + 4Y(s) = e^{-7s} + e^{-20s} \]

Factoring,

\[ Y(s)(s^2 + 4) = s + 1 + e^{-7s} + e^{-20s} \]

and therefore

\[ Y(s) = \frac{s}{s^2+4} + \frac{1}{s^2+4} + e^{-7s}\,\frac{1}{s^2+4} + e^{-20s}\,\frac{1}{s^2+4} \]

Using the second shifting property to find the inverse of the last two terms on
the right, we find

\[ y(t) = \cos 2t + \frac{1}{2}\sin 2t + \frac{1}{2}u(t-7)\sin 2(t-7) + \frac{1}{2}u(t-20)\sin 2(t-20) \]


A plot of the solution function y(t) is shown in figure 5.11. We know that
because the system is undamped, once it is set in motion it will oscillate at the
same amplitude indefinitely in the absence of other forces. When the hammer
blows are delivered at t = 7 and t = 20, this will obviously change the amplitude
of oscillation. At first the observed behavior may seem counterintuitive, as the
hammer strikes are diminishing the amount of oscillation. However, if we note
that the impulses are delivered in the positive direction at a time when the
mass is traveling in the negative direction, then, indeed, the resulting solution
accurately models the physical situation.

Figure 5.11 The solution to the IVP of
example 5.5.5.


It is interesting to explore how delivering the impulses at other times impacts 
the system. Note that our work with Laplace transforms in example 5.5.5 is 
essentially unchanged by the times the impulses occur. In particular, if the 
hammer strikes occur at t = a and t = b, then the solution will be

\[ y(t) = \cos 2t + \frac{1}{2}\sin 2t + \frac{1}{2}u(t-a)\sin 2(t-a) + \frac{1}{2}u(t-b)\sin 2(t-b) \]


If we choose a = 9 and b = 18, we see substantially different behavior in the 
solution function due to the fact that these impulses occur in the same direction 
as the motion at the time they are delivered. A plot of the solution y(t) in this 
case is shown in figure 5.12. 
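
To make the comparison between the two choices of impulse times concrete, the following minimal Python sketch tracks the amplitude of the oscillation before the first strike, between the strikes, and after the second strike. It uses only the solution formula above together with the identity sin 2(t − a) = cos 2a sin 2t − sin 2a cos 2t; the function name `amplitudes` is our own, introduced purely for illustration.

```python
import math

def amplitudes(a, b):
    """Return the amplitude of y(t) before the first impulse, between the
    two impulses, and after the second impulse (impulse times a < b)."""
    # For t < a the solution is y = c*cos(2t) + s*sin(2t) with c = 1, s = 1/2.
    c, s = 1.0, 0.5
    result = [math.hypot(c, s)]
    for tau in (a, b):
        # Each impulse adds (1/2) sin 2(t - tau)
        #   = (1/2) cos(2 tau) sin 2t - (1/2) sin(2 tau) cos 2t.
        c -= 0.5 * math.sin(2 * tau)
        s += 0.5 * math.cos(2 * tau)
        result.append(math.hypot(c, s))
    return result

print(amplitudes(7, 20))   # strikes at t = 7 and t = 20
print(amplitudes(9, 18))   # strikes at t = 9 and t = 18
```

With strikes at t = 7 and t = 20 the three amplitudes decrease in turn, while with strikes at t = 9 and t = 18 they increase, in agreement with the discussion of figures 5.11 and 5.12.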


Exercises 5.5 In exercises 1—20, solve the stated initial-value problem using 
Laplace transforms. In each case, sketch a plot of your solution. 

1. $y' + 5y = 20$, $y(0) = 3$

2. $y' + 3y = e^{-t}$, $y(0) = -2$

3.y'—-2y=e7#, y(0)=1 

4. $y' + 4y = \sin 3t$, $y(0) = 5$

5. $y' + y = te^{t}$, $y(0) = -1$
Figure 5.12 The solution to the IVP of example 5.5.5 where the impulses instead occur at t = 9 and t = 18.


6. $y' - 8y = u(t-1)$, $y(0) = -4$
7. $y' - 8y = u(t-3)\cdot t$, $y(0) = -4$
8. $y' - 8y = \delta(t-1)$, $y(0) = -4$

9. y”+9y=0, y(0)=0, y'(0)= 


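11. $y'' + 9y = 2$, $y(0) = 0$, $y'(0) = 1$
12. $y'' + 9y = 5\cos t$, $y(0) = 0$, $y'(0) = 0$

13. $y'' + 9y = 5\cos 3t$, $y(0) = 0$, $y'(0) = 0$
14. $y'' + 7y' + 12y = 0$, $y(0) = 0$, $y'(0) = 3$

15. $y'' + 6y' + 9y = 0$, $y(0) = 2$, $y'(0) = 0$

16. $y'' + 2y' + y = 3t$, $y(0) = 0$, $y'(0) = 0$

17. $y'' + 2y' + 5y = u(t-4)$, $y(0) = 1$, $y'(0) =$
18. $y'' - 2y' - 3y = u(t-3)$, $y(0) = 2$, $y'(0) =$
19. $y'' - 2y' - 3y = u(t-3)$, $y(0) = 2$, $y'(0) = 0$
20. $y'' + 2y' + 5y = \delta(t-1)$, $y(0) = 0$, $y'(0) = 0$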


For exercises 21-26, solve the stated initial-value problem from exercises 1-20 
by standard means developed in preceding chapters (i.e., without using Laplace 
transforms). 


21. $y' + 3y = e^{-t}$, $y(0) = -2$
22. $y' + 4y = \sin 3t$, $y(0) = 5$
23. $y' + y = te^{t}$, $y(0) = -1$
24. $y'' + 9y = 2$, $y(0) = 0$, $y'(0) = 1$
25. $y'' + 9y = 5\cos 3t$, $y(0) = 0$, $y'(0) = 0$
26. $y'' + 2y' + y = 3t$, $y(0) = 0$, $y'(0) = 0$


In exercises 27—32, use Laplace transforms to determine the displacement y(t) 
of the spring-mass system with spring constant k = 72 and mass m = 2 kg for 
the given forcing function f(t). Assume each time the system starts from rest; 
solve for y(t) in the cases where the damping constant c is (a) c = 0, (b) c = 2,
(c) c = 24, and (d) c = 40, assuming consistent units. Sketch a plot of each 
solution. 


27. f(tj= 

28. f(t) = 10sin2t 

29. f(t) = 10sin 6t 

30. $f(t) = 10[u(t) - u(t-4\pi)]$
31. f(t) = 102-9" 

32. $f(t) = 100\delta(t)$


In exercises 33-38, consider an RLC circuit for which an inductor of L = 1 H
and capacitor C = 0.01 F are present. For each given forcing function f(t), use
Laplace transforms to determine the charge Q(t) and current I(t) in the circuit
at time t if initially Q(0) = 0 and I(0) = 0. Determine the charge and current in
the cases where the resistance is (a) R = 0 Ω, (b) R = 16 Ω, (c) R = 20 Ω, and
(d) R = 25 Ω, assuming consistent units. Sketch a plot of each solution.


33. f(t) = 
34. f(t) = 10sin 10t 

35. f(t) = 5sin 10¢t 

36. $f(t) = 10[u(t) - u(t-2\pi)]$
37. $f(t) = 10\delta(t)$

38. f(t) = 20e~* 


5.6 More on the inverse Laplace transform 


In this section, we provide an overall summary of properties of the inverse 
transform and present some further practice with computations. We close with 
a discussion of how transforms and inverse transforms may be found using a 
computer algebra system. 

To begin, table 5.3 provides a list of familiar functions F(s) and their inverse 
transforms, as well as several key general properties of the inverse transform. 
Table 5.3
Inverse Laplace transforms of some basic functions and other fundamental properties.

$F(s)$                  $f(t) = \mathcal{L}^{-1}[F(s)]$
$1/s^{n+1}$             $t^n/n!$
$1/(s-a)$               $e^{at}$
$s/(s^2+k^2)$           $\cos kt$
$k/(s^2+k^2)$           $\sin kt$
$s/(s^2-k^2)$           $\cosh kt$
$k/(s^2-k^2)$           $\sinh kt$
$aF(s)+bG(s)$           $af(t)+bg(t)$
$F(s-a)$                $e^{at}f(t)$
$e^{-as}$               $\delta(t-a)$
$e^{-as}F(s)$           $u(t-a)f(t-a)$


Most of the lines in the table are derived from taking the inverse perspective 
on statements in tables 5.1 and 5.2. While full tables of Laplace transforms 
typically number many pages, we present only a small collection for use in 
standard problems involving spring-mass systems and RLC circuits, leaving 
other examples for exploration in other sources or computer algebra systems. 


The next several examples demonstrate standard techniques in the computation
of inverse transforms.


Example 5.6.1 Determine $\mathcal{L}^{-1}[F(s)]$ for each of the following functions:

(a) $F(s) = \dfrac{e^{-2s}}{s(s+1)^2}$

(b) $F(s) = \dfrac{2}{s^4+4s^2}$

(c) $F(s) = \dfrac{4s\,e^{-2\pi s}}{(s^2+2s+5)(s^2+9)}$


Solution. (a) Because of the presence of $e^{-2s}$ in $F(s)$, we will use the second
shifting property. But first, we find the partial fraction decomposition
$$\frac{1}{s(s+1)^2} = \frac{1}{s} - \frac{1}{s+1} - \frac{1}{(s+1)^2}$$
and note that
$$\mathcal{L}^{-1}\left[\frac{1}{s(s+1)^2}\right] = \mathcal{L}^{-1}\left[\frac{1}{s} - \frac{1}{s+1} - \frac{1}{(s+1)^2}\right] = 1 - e^{-t} - te^{-t}.$$
Now, in order to compute the inverse transform of the given function, we use
the second shifting property to address the presence of $e^{-2s}$ in each term and
thus find that
$$\mathcal{L}^{-1}\left[\frac{e^{-2s}}{s(s+1)^2}\right] = u(t-2)\left[1 - e^{-(t-2)} - (t-2)e^{-(t-2)}\right].$$


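(b) Partial fractions shows that
$$F(s) = \frac{2}{s^4+4s^2} = \frac{1}{2}\left(\frac{1}{s^2} - \frac{1}{s^2+4}\right).$$
Using the inverses of familiar transforms of $f(t) = t$ and $f(t) = \sin 2t$, we see
$$\mathcal{L}^{-1}\left[\frac{2}{s^4+4s^2}\right] = \frac{1}{2}\left(t - \frac{1}{2}\sin 2t\right).$$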


(c) Given the function
$$F(s) = \frac{4s\,e^{-2\pi s}}{(s^2+2s+5)(s^2+9)},$$
we see that the presence of $e^{-2\pi s}$ implies the inverse of the second shifting
property will be used. As is now customary, we first use partial fractions to break the
rational part of $F(s)$ into a sum of simpler expressions. Doing so and completing
the square to re-express $s^2+2s+5$,
$$\frac{4s}{(s^2+2s+5)(s^2+9)} = \frac{1}{13}\left(-\frac{4s-18}{s^2+9} + \frac{4s-10}{s^2+2s+5}\right)$$
$$= \frac{1}{13}\left(-\frac{4s}{s^2+9} + \frac{18}{s^2+9} + \frac{4(s+1)}{(s+1)^2+4} - \frac{14}{(s+1)^2+4}\right).$$
Letting $G(s) = 4s/[(s^2+2s+5)(s^2+9)]$, it now follows from familiar rules with
inverse transforms and the first shifting property that
$$\mathcal{L}^{-1}[G(s)] = -\frac{4}{13}\cos 3t + \frac{18}{39}\sin 3t + \frac{4}{13}e^{-t}\cos 2t - \frac{7}{13}e^{-t}\sin 2t.$$
Finally, since $F(s) = e^{-2\pi s}G(s)$, the second shifting property implies
$$\mathcal{L}^{-1}[F(s)] = u(t-2\pi)\left(-\frac{4}{13}\cos 3(t-2\pi) + \frac{6}{13}\sin 3(t-2\pi)\right)$$
$$\quad + u(t-2\pi)\left(\frac{4}{13}e^{-(t-2\pi)}\cos 2(t-2\pi) - \frac{7}{13}e^{-(t-2\pi)}\sin 2(t-2\pi)\right).$$
The $2\pi$ shift in each of the sine and cosine functions can be removed; for
instance, $\cos 3(t-2\pi) = \cos 3t$. Doing so throughout shows that
$$\mathcal{L}^{-1}[F(s)] = u(t-2\pi)\left(-\frac{4}{13}\cos 3t + \frac{6}{13}\sin 3t + \frac{4}{13}e^{-(t-2\pi)}\cos 2t - \frac{7}{13}e^{-(t-2\pi)}\sin 2t\right).$$
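
A result such as the one in part (c) can be sanity-checked numerically. The sketch below is not part of the method developed in this chapter: it approximates the defining integral of the Laplace transform with Simpson's rule (a standard numerical integration rule, assumed here) and compares the transform of the unshifted function with $G(s) = 4s/[(s^2+2s+5)(s^2+9)]$ at a couple of sample values of s. The helper names `laplace_numeric`, `g`, and `G` are ours.

```python
import math

def laplace_numeric(f, s, T=60.0, n=60000):
    """Approximate the Laplace transform of f at s by Simpson's rule on [0, T].
    Assumes f(t)*exp(-s*t) is negligible beyond T; n must be even."""
    h = T / n
    total = f(0.0) + f(T) * math.exp(-s * T)
    for k in range(1, n):
        t = k * h
        total += (4 if k % 2 else 2) * f(t) * math.exp(-s * t)
    return total * h / 3

def g(t):
    # Inverse transform of G(s) from part (c), before the 2*pi shift is applied.
    return (-4 / 13 * math.cos(3 * t) + 6 / 13 * math.sin(3 * t)
            + math.exp(-t) * (4 / 13 * math.cos(2 * t) - 7 / 13 * math.sin(2 * t)))

def G(s):
    return 4 * s / ((s**2 + 2 * s + 5) * (s**2 + 9))

for s in (1.0, 2.0):
    print(s, laplace_numeric(g, s), G(s))
```

For each sample value of s the two printed numbers should agree to many decimal places, which supports the partial fraction coefficients found above.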


There are certainly other properties of the inverse Laplace transform that we 
could study. For example, theorem 5.3.4 in inverse form allows us to say that if 
$\mathcal{L}^{-1}[F(s)] = f(t)$ and $f(0) = 0$, then

$$\mathcal{L}^{-1}[sF(s)] = f'(t) \qquad (5.6.1)$$
While results like this are theoretically interesting and can occasionally enable 
us to determine inverse transforms in alternate ways, they are less useful 
in pragmatic terms when we think of our overarching goal: using Laplace 
transforms to solve initial-value problems. 

Indeed, our work throughout this chapter has given us a good overview of 
how Laplace transforms work, especially the role they play in solving initial- 
value problems. Of course, there are also many forcing functions we have not 
discussed for which Laplace transforms may be taken. There are books that 
contain lengthy tables of Laplace transforms and inverse transforms that we 
could, if necessary, consult. But because of the technology available to us, these 
tables have essentially been rendered obsolete. Most computer algebra systems 
are fully capable of computing Laplace transforms and their inverses, so we 
choose not to study methods for these more difficult calculations. The next 
example demonstrates one such function F(s) which is beyond the methods we 
have developed but that can easily be handled by a computer algebra system. 


Example 5.6.2 Find the inverse Laplace transform of
$$F(s) = \frac{9}{(s^2+1)^2(s^2+4)^2}.$$

Solution. The partial fraction decomposition of $F(s)$ is
$$F(s) = -\frac{2}{3}\cdot\frac{1}{s^2+1} + \frac{1}{(s^2+1)^2} + \frac{2}{3}\cdot\frac{1}{s^2+4} + \frac{1}{(s^2+4)^2}. \qquad (5.6.2)$$
Two of the terms in (5.6.2) are straightforward to invert, but the two involving
squares of irreducible quadratic terms are not among familiar functions from
our previous work. In the following subsection, we demonstrate how to use
Maple to compute the inverse transform of such functions. These computations
reveal that
$$\mathcal{L}^{-1}\left[\frac{1}{(s^2+1)^2}\right] = \frac{1}{2}\sin t - \frac{1}{2}t\cos t$$
and
$$\mathcal{L}^{-1}\left[\frac{1}{(s^2+4)^2}\right] = \frac{1}{16}\sin 2t - \frac{1}{8}t\cos 2t.$$
From this work and (5.6.2), we find
$$\mathcal{L}^{-1}[F(s)] = -\frac{2}{3}\sin t + \frac{1}{2}\sin t - \frac{1}{2}t\cos t + \frac{1}{3}\sin 2t + \frac{1}{16}\sin 2t - \frac{1}{8}t\cos 2t$$
$$= -\frac{1}{6}\sin t - \frac{1}{2}t\cos t + \frac{19}{48}\sin 2t - \frac{1}{8}t\cos 2t.$$


Further discussion of how to use Maple to compute transforms and inverse 
transform follows in the next subsection. 
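
Although we rely on a computer algebra system for the two inverse transforms in example 5.6.2, the first of them can be confirmed by hand in the forward direction. The short derivation below is a sketch that assumes the standard property $\mathcal{L}[t f(t)] = -F'(s)$, which we have not established in this section.

```latex
\begin{align*}
\mathcal{L}[t\cos t] &= -\frac{d}{ds}\left[\frac{s}{s^2+1}\right]
  = \frac{s^2-1}{(s^2+1)^2},\\[4pt]
\mathcal{L}\left[\tfrac12\sin t - \tfrac12\,t\cos t\right]
  &= \frac12\cdot\frac{1}{s^2+1} - \frac12\cdot\frac{s^2-1}{(s^2+1)^2}
  = \frac{(s^2+1)-(s^2-1)}{2(s^2+1)^2}
  = \frac{1}{(s^2+1)^2}.
\end{align*}
```

The same steps with $\cos 2t$ and $\sin 2t$ in place of $\cos t$ and $\sin t$ confirm the second formula.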


5.6.1 Laplace transforms and inverse transforms using Maple


As we have noted, while we have computed Laplace transforms for a range of 
functions, there are many more examples we have not considered. Moreover, 
even for familiar functions, certain combinations of them can lead to tedious, 
involved calculations. Computer algebra systems such as Maple are fully capable 
of computing Laplace transforms of functions, as well as inverse transforms. 
Here we demonstrate the syntax required in the solution of the initial-value 
problem from example 5.5.4:


$$y'' + 4y' + 13y = 2u(t-\pi)\sin 3t, \qquad y(0) = 1, \quad y'(0) = 0 \qquad (5.6.3)$$
To begin, we load the inttrans package in Maple.


> with(inttrans);


If, for example, we desire to use Maple to compute the Laplace transform of
$2u(t-\pi)\sin 3t$, we use the syntax


> laplace(2*Heaviside(t-Pi)*sin(3*t),t,s);


This command results in the output
$$-\frac{6e^{-\pi s}}{s^2+9}$$
which is precisely the transform we expect.


After computing by hand the transform of the left-hand side of (5.6.3) and 
solving for Y(s), as shown in detail in example 5.5.4, we have 


$$Y(s) = \frac{s+4}{s^2+4s+13} - 2e^{-\pi s}\frac{3}{(s^2+9)(s^2+4s+13)}$$


Here, we may use Maple's invlaplace command to determine $\mathcal{L}^{-1}[Y(s)]$.
While we could choose to do so all at once, for simplicity of display we do so in
two steps. First,


> invlaplace((s+4)/(s^2 + 4*s + 13),s,t);

results in the output
$$\frac{1}{3}e^{-2t}\left(3\cos(3t) + 2\sin(3t)\right) \qquad (5.6.4)$$


Similarly, for the second term in Y(s), we compute 


> invlaplace(2*exp(-Pi*s)*3/((s^2 + 9)*(s^2 + 4*s
+ 13)),s,t);


Maple produces the output
$$\frac{1}{20}\,\mathrm{Heaviside}(t-\pi)\left(3\cos(3t) - \sin(3t) - e^{-2t+2\pi}\left(3\cos(3t)+\sin(3t)\right)\right) \qquad (5.6.5)$$
which corresponds to our work in example 5.5.4. Because the second term enters
$Y(s)$ with a negative sign, subtracting the function in (5.6.5) from the function
in (5.6.4) gives precisely the solution to the IVP.


Note that in computing the inverse transform (5.6.5), Maple has implicitly 
executed the partial fraction decomposition of the expression
$$\frac{3}{(s^2+9)(s^2+4s+13)}$$
If we wish to find this explicitly, we can use the command


> convert(3/((s^2 + 9)*(s^2 + 4*s + 13)),
parfrac, s);


which produces the output
$$\frac{1}{40}\,\frac{3-3s}{s^2+9} + \frac{1}{40}\,\frac{9+3s}{s^2+4s+13}$$
In general, we see that to compute the Laplace transform of f(t) in Maple 
we use the syntax 


> laplace(f(t),t,s);
whereas to compute the inverse transform of F(s), we enter 
> invlaplace(F(s),s,t); 
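
For readers without access to Maple, a decomposition like the one returned by the parfrac conversion above can still be checked independently. The following Python sketch simply evaluates both sides of the claimed identity at a few sample points, including one complex value; the function names `lhs` and `rhs` are ours, and agreement at sample points is a sanity check rather than a proof.

```python
def lhs(s):
    # Original rational function 3 / ((s^2 + 9)(s^2 + 4s + 13)).
    return 3 / ((s**2 + 9) * (s**2 + 4 * s + 13))

def rhs(s):
    # Partial fraction decomposition with coefficients (3 - 3s)/40 and (9 + 3s)/40.
    return (3 - 3 * s) / (40 * (s**2 + 9)) + (9 + 3 * s) / (40 * (s**2 + 4 * s + 13))

samples = [0.0, 1.0, -2.5, 7.0, complex(1, 2)]
for s in samples:
    print(s, lhs(s), rhs(s))
```

If the coefficients were wrong, the two columns would disagree at most sample points, so this is a quick way to catch a sign or arithmetic slip.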


Exercises 5.6 In exercises 1-9, find the inverse Laplace transform of the given 
function F(s) using familiar techniques or a computer algebra system. 


2s 
1 FQ)= Gyay 
2. F(s)= 




FS cas 

4, F(s) = oe ED 

5, F(s) = oe 
AS) = oa 

7. F(s)= Seren 
8. F(s) = cs ea 
8, F(a) ert 4 


s(s—1)(s? —5s +4) 
In exercises 10-22, solve the stated initial-value problem using Laplace
transforms (using a computer algebra system as necessary). Sketch a plot of
each solution.

10.y/+y=e'+te', y(0)=1 

11. $y'' + 4y = \sin 2t$, $y(0) = 0$, $y'(0) = 1$

12. $y'' + 4y = \sin 2t + \delta(t-6)$, $y(0) = 0$, $y'(0) = 1$

13. $y'' + 4y = \sin 2t + \delta(t-6) + \delta(t-12)$, $y(0) = 0$, $y'(0) = 1$
14. $y'' + 9y = \cos 3t + t\cos 3t$, $y(0) = 0$, $y'(0) = 1$

15. y”+2y'+5y =e‘ sin2t, y(0)=0, y'(0)=1 

16. y’ +2y' +5y =e‘ sin2t + te’ sin 2t, y(0)=0, y/(0)=1 


17. y" +2y' +5y =e 'sin2t+ u(t —2) te‘ sin2t, y(0)=0, y’(0)=1 


18. y"+y'—2y=4e'+1, (0) = 1, -y/(0) =0 
19, y"+y' —2y =4e'+14+6(t-3), y(O)=1, y'(0)=0 
20. y+ y'—2y=4e'+u(t—3), y(O)=1, y/(0)=0 


21. y"4+2y'+5y=e 'sin2t+ te ‘sin2t+d(t—5), y(0)=0, y(0)=1 


22. y’ +2y' +5y = 13e' sint, y(0)=0, y'(0)=0 
5.7 For further study
5.7.1 Laplace transforms of infinite series 


If f(t) is a function of exponential order that is analytic⁶ at t = 0 with an infinite
radius of convergence, then f(t) may be expressed as a power series and also has
a Laplace transform. It therefore follows that if

$$f(t) = \sum_{n=0}^{\infty} a_n t^n$$

then its transform is
$$\mathcal{L}[f(t)] = \sum_{n=0}^{\infty} a_n \mathcal{L}[t^n] = \sum_{n=0}^{\infty} a_n \frac{n!}{s^{n+1}} \qquad (5.7.1)$$


We begin by exploring the transforms of some familiar functions through the 
use of infinite series. 


(a) Recall that $f(t) = e^t$ is analytic at t = 0 with series expansion
$$e^t = \sum_{n=0}^{\infty} \frac{t^n}{n!} = 1 + t + \frac{t^2}{2!} + \frac{t^3}{3!} + \cdots \qquad (5.7.2)$$
By taking the Laplace transform of the series (5.7.2) term-wise,⁷ show that
$$\mathcal{L}[e^t] = \sum_{n=0}^{\infty} \frac{1}{s^{n+1}} \qquad (5.7.3)$$
Then, recognize (5.7.3) as a geometric series to show that
$$\mathcal{L}[e^t] = \frac{1}{s-1}$$
(b) Similarly, use the fact that $f(t) = \sin t$ has the series expansion
$$\sin t = t - \frac{t^3}{3!} + \frac{t^5}{5!} - \cdots$$
to show using infinite series that
$$\mathcal{L}[\sin t] = \frac{1}{s^2+1}$$


6 More on power series expansions of functions and the meaning of terms such as "analytic" may be
found in section 8.2.

7 While the Laplace transform of a finite sum is the sum of the Laplace transforms of the individual 
terms, it is not obvious that this property holds for infinite sums. The formal justification that this is 
valid in what follows is beyond the scope of this text; the reader may assume that this step is valid, 
and proceed as directed. 
In addition, develop the Laplace transform of $f(t) = \cos t$ using the series
expansion $\cos t = 1 - t^2/2! + t^4/4! - \cdots$.

While power series expansions of such familiar functions as $e^t$, $\sin t$,
and $\cos t$ are important and offer a different perspective on the development
of the transforms of these functions, power series are even more useful for
working with functions that are more complicated. For example, if we seek the
transform of
$$\frac{e^{-t}-1}{t} \qquad (5.7.4)$$
none of the methods we have previously discussed apply. However, standard
techniques⁸ with infinite series may be used to address functions
such as (5.7.4).


(c) Use the standard power series expansion for $e^t$ to show that
$f(t) = (e^{-t}-1)/t$ has the series expansion
$$\frac{e^{-t}-1}{t} = -1 + \frac{t}{2!} - \frac{t^2}{3!} + \cdots + \frac{(-1)^n t^{n-1}}{n!} + \cdots$$
Then, compute the Laplace transform of the series expression to show that
$$\mathcal{L}\left[\frac{e^{-t}-1}{t}\right] = -\frac{1}{s} + \frac{1}{2s^2} - \frac{1}{3s^3} + \cdots \qquad (5.7.5)$$


(d) Even though the Laplace transform of an analytic function will result in an
infinite sum involving negative powers of s, sometimes we can recognize
the transform as a familiar function. To see this in (5.7.5), use the known
series expansion
$$\ln(1+x) = x - \frac{1}{2}x^2 + \frac{1}{3}x^3 - \cdots$$
and the substitution x = 1/s to show that
$$\mathcal{L}\left[\frac{e^{-t}-1}{t}\right] = -\ln\left(1+\frac{1}{s}\right)$$


(e) From the standard series expansion for the function sin t, determine the
Taylor series of
$$f(t) = \frac{\sin t}{t} \qquad (5.7.6)$$
and hence compute the Laplace transform of (5.7.6). Then, use the
expansion
$$\arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots$$
and an appropriate substitution to show that
$$\mathcal{L}\left[\frac{\sin t}{t}\right] = \arctan\frac{1}{s}$$

8 A review of the development of power series of functions can be found in section 8.2.

(f) Use series techniques to show that 
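
As a numerical companion to parts (c) and (d), the partial sums of the series (5.7.5) can be compared with the closed form $-\ln(1 + 1/s)$. The sketch below assumes s > 1 so that the series converges quickly; the function name `series_transform` and the choice of 60 terms are ours.

```python
import math

def series_transform(s, terms=60):
    """Partial sum of -1/s + 1/(2 s^2) - 1/(3 s^3) + ... from (5.7.5)."""
    return sum((-1) ** n / (n * s ** n) for n in range(1, terms + 1))

for s in (2.0, 5.0):
    print(s, series_transform(s), -math.log(1 + 1 / s))
```

The two printed values agree closely for each s, illustrating how a transform obtained as a series in negative powers of s may be recognized as a familiar function.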


5.7.2 Laplace transforms of periodic forcing 
functions 


Nonhomogeneous differential equations often involve periodic forcing functions.
In section 4.5, we considered the effects of the forcing function $f(t) = \sin \omega t$
in connection with the natural frequency of a system. More generally,
here we examine periodic forcing functions that are piecewise continuous. Such
functions satisfy the relationship that for some value of a,

$$f(t) = f(t+a) = f(t+2a) = f(t+3a) = \cdots = f(t+na) = \cdots \qquad (5.7.7)$$


An example of such a function is shown in figure 5.13. Taking the Laplace 
transform of such a function f, we may write the transform as the infinite sum 
of integrals 


$$\mathcal{L}[f(t)] = \int_0^\infty f(t)e^{-st}\,dt = \int_0^a f(t)e^{-st}\,dt + \int_a^{2a} f(t)e^{-st}\,dt + \int_{2a}^{3a} f(t)e^{-st}\,dt + \cdots \qquad (5.7.8)$$

Figure 5.13 A periodic function with period a that is piecewise continuous.
(a) Using the change of variables $t = \tau + a$ in the second integral, $t = \tau + 2a$
in the third, and so on, show that
$$\mathcal{L}[f(t)] = \int_0^a f(t)e^{-st}\,dt + \int_0^a f(\tau+a)e^{-s(\tau+a)}\,d\tau + \int_0^a f(\tau+2a)e^{-s(\tau+2a)}\,d\tau + \cdots \qquad (5.7.9)$$
(b) By replacing the integration variable $\tau$ with $t$ in (5.7.9), show that
$$\mathcal{L}[f(t)] = \left[1 + e^{-as} + e^{-2as} + \cdots\right]\int_0^a f(t)e^{-st}\,dt \qquad (5.7.10)$$

Then, use the fact that the infinite series in (5.7.10) is geometric in order to
conclude
$$\mathcal{L}[f(t)] = \frac{1}{1-e^{-as}}\int_0^a f(t)e^{-st}\,dt \qquad (5.7.11)$$


(c) Use (5.7.11) to determine the Laplace transform of the square wave 
function shown in figure 5.14. (The vertical lines shown in the graph are 
not actually part of the function’s graph; indeed, f is piecewise constant 
with value 3 on [0, 2) and value −3 on [2, 4), and so on.)

In particular, show that
$$\mathcal{L}[f(t)] = \frac{3}{s}\cdot\frac{1-e^{-2s}}{1+e^{-2s}}$$
where f(t) is the function pictured in figure 5.14.


Figure 5.14 A square wave with amplitude 3 
and period 4. 
(d) Consider the periodic function with period $2\pi$ given by
$$f(t) = \begin{cases} \sin t, & \text{if } 0 \le t < \pi \\ 0, & \text{if } \pi \le t < 2\pi \end{cases}$$
This function is called the half-rectified sine wave since it only consists of
the top-half of the standard sine function. Sketch a graph of this function
and show that its Laplace transform is
$$\mathcal{L}[f(t)] = \frac{1+e^{-\pi s}}{(1-e^{-2\pi s})(s^2+1)}$$

(e) Let a slightly damped spring-mass system be given with m = 1, c = 0.02,
and k = 25, and be driven by a square-wave periodic forcing function f(t)
with amplitude 5 and period $2\pi$. We will use Laplace transforms to solve
the initial-value problem that governs this system under the assumption
that the system starts from rest.


(i) The stated problem is modeled by the initial-value problem
$$y'' + 0.02y' + 25y = f(t), \qquad y(0) = 0, \quad y'(0) = 0$$
Take Laplace transforms to show that $Y(s) = \mathcal{L}[y(t)]$ must satisfy the
equation
$$Y(s) = \frac{F(s)}{s^2 + 0.02s + 25} \qquad (5.7.12)$$
where $F(s) = \mathcal{L}[f(t)]$.

(ii) While we have learned in (c) how to write the transform of a square
wave function without using infinite series in its expression, it turns
out for this problem that a series expansion is necessary for finding the
inverse transform when solving the IVP. By writing the square wave
function given in this problem in the form
$$f(t) = 5u(t) - 10u(t-\pi) + 10u(t-2\pi) - 10u(t-3\pi) + \cdots$$
show that
$$F(s) = \mathcal{L}[f(t)] = \frac{5}{s}\left[1 - 2e^{-\pi s} + 2e^{-2\pi s} - 2e^{-3\pi s} + \cdots\right] \qquad (5.7.13)$$
(iii) Explain why
$$\frac{1}{s^2+0.02s+25} \approx \frac{1}{(s+0.01)^2+5^2} \qquad (5.7.14)$$
(iv) Combine (5.7.12), (5.7.13), and (5.7.14) in order to conclude that
$$y(t) = \mathcal{L}^{-1}[Y(s)] \approx \mathcal{L}^{-1}\left[\frac{5}{s[(s+0.01)^2+5^2]}\left(1 - 2e^{-\pi s} + 2e^{-2\pi s} - \cdots\right)\right] \qquad (5.7.15)$$
Explain why we have to find the inverse transform in (5.7.15)
term-by-term.


(v) Compute the inverse transform of the first term
$$y_1(t) = \mathcal{L}^{-1}\left[\frac{5}{s[(s+0.01)^2+5^2]}\right]$$
in (5.7.15) given the partial fraction decomposition
$$\frac{5}{s[(s+0.01)^2+5^2]} \approx \frac{0.2}{s} - \frac{0.2s+0.004}{(s+0.01)^2+5^2}$$
(Hint: 0.2s + 0.004 = 0.2(s + 0.01) + 0.002)
Conclude that
$$y_1(t) = 0.2 - e^{-0.01t}\left(0.2\cos 5t + 0.0004\sin 5t\right) \qquad (5.7.16)$$
(vi) Compute the inverse transform of the second term
$$y_2(t) = \mathcal{L}^{-1}\left[-2e^{-\pi s}\,\frac{5}{s[(s+0.01)^2+5^2]}\right]$$
in (5.7.15) using (5.7.16) and the second shifting property.

Using the fact that $\cos 5(t-\pi) = -\cos 5t$ and $\sin 5(t-\pi) = -\sin 5t$,
conclude that
$$y_2(t) = -2u(t-\pi)\left\{0.2 + e^{-0.01(t-\pi)}\left(0.2\cos 5t + 0.0004\sin 5t\right)\right\}$$
$$= -2u(t-\pi)\left\{0.2 + e^{0.01\pi}\left[0.2 - y_1(t)\right]\right\} \qquad (5.7.17)$$
(vii) Compute the inverse transform of the third term
$$y_3(t) = \mathcal{L}^{-1}\left[2e^{-2\pi s}\,\frac{5}{s[(s+0.01)^2+5^2]}\right]$$
in (5.7.15) using (5.7.16) and the second shifting property.

Using the fact that $\cos 5(t-2\pi) = \cos 5t$ and
$\sin 5(t-2\pi) = \sin 5t$, conclude that
$$y_3(t) = 2u(t-2\pi)\left\{0.2 - e^{-0.01(t-2\pi)}\left(0.2\cos 5t + 0.0004\sin 5t\right)\right\}$$
$$= 2u(t-2\pi)\left\{0.2 - e^{0.02\pi}\left[0.2 - y_1(t)\right]\right\} \qquad (5.7.18)$$
(viii) So far, we have found the formula for y(t) valid up to $t = 3\pi$. In fact,
$$y(t) = \begin{cases} y_1(t), & \text{if } 0 < t < \pi \\ y_1(t) + y_2(t), & \text{if } \pi < t < 2\pi \\ y_1(t) + y_2(t) + y_3(t), & \text{if } 2\pi < t < 3\pi \end{cases}$$
Using $y_1(t) = 0.2 - [0.2 - y_1(t)]$, together with (5.7.17)
and (5.7.18), plus the fact that on $2\pi < t < 3\pi$ we know
$u(t-\pi) = u(t-2\pi) = 1$, show that on $2\pi < t < 3\pi$,

$$y(t) = 0.2 - [0.2 - y_1(t)]\left\{1 + 2e^{0.01\pi} + 2e^{0.02\pi}\right\}$$
(ix) Using the patterns established in (5.7.17) and (5.7.18), explain why
$$y(t) = y_1(t) + y_2(t) + \cdots + y_{n+1}(t)$$
$$= (-1)^n\,0.2 - [0.2 - y_1(t)]\left\{1 + 2e^{0.01\pi} + \cdots + 2e^{0.01n\pi}\right\} \qquad (5.7.19)$$
is valid for $n\pi < t < (n+1)\pi$ for any positive integer n.


(x) Letting $z(t) = e^{-0.01t}(\cos 5t + 0.002\sin 5t)$ and using the fact that
$(1 - x^{n+1})/(1-x) = 1 + x + x^2 + \cdots + x^n$, show that on $n\pi < t < (n+1)\pi$,
$$y(t) = (-1)^n\left(\frac{1}{5}\right) - z(t)\left(\frac{2}{5(1-e^{0.01\pi})} - \frac{1}{5}\right) + \frac{2e^{(n+1)0.01\pi}}{5(1-e^{0.01\pi})}\,z(t) \qquad (5.7.20)$$
Explain why as t > oo, it follows that y(t) > oo. Using a computer 
algebra system, graph the solution function on several consecutive
large intervals of width $\pi$, such as $[200\pi, 201\pi]$, $[201\pi, 202\pi]$, etc.,
and discuss the behavior of the system.
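
Returning to part (c) of this project, formula (5.7.11) lends itself to a quick numerical check. The Python sketch below integrates the square wave of figure 5.14 over one period with Simpson's rule (an assumed, standard numerical rule), applies the factor $1/(1 - e^{-as})$ with a = 4, and compares the result with the closed form $\frac{3}{s}\cdot\frac{1-e^{-2s}}{1+e^{-2s}}$. All function names are ours.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def square_wave_transform(s):
    """Evaluate (5.7.11) for the square wave: +3 on [0, 2), -3 on [2, 4)."""
    # Integrate each constant piece separately to avoid the jump at t = 2.
    one_period = (simpson(lambda t: 3 * math.exp(-s * t), 0, 2)
                  + simpson(lambda t: -3 * math.exp(-s * t), 2, 4))
    return one_period / (1 - math.exp(-4 * s))

def closed_form(s):
    return 3 / s * (1 - math.exp(-2 * s)) / (1 + math.exp(-2 * s))

for s in (0.5, 1.0, 3.0):
    print(s, square_wave_transform(s), closed_form(s))
```

Splitting the integral at the jump keeps the integrand smooth on each piece, which is why the numerical and closed-form values agree so closely.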


5.7.3 Laplace transforms of systems 


Recall that the standard initial-value problem for a system of first-order DEs is 
given in matrix form by 


x’ = Ax+f(t), x(0)=b (5.7.21) 


In the event that f is a continuous function, the variation of parameters 
technique applies. But, if f is a step function or otherwise piecewise defined, 
our earlier methods fail, and Laplace transforms may be used. Regardless, the 
Laplace transform can be a useful tool for systems for many of the same reasons 
it is for single DEs, such as the fact that it treats all linear systems in a uniform 
manner and incorporates the initial conditions immediately into the process of 
finding the solution. 

Since each of the three terms in the equation in (5.7.21) is a vector, Laplace 
transforms may be applied component-wise. For example, 


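$$\mathcal{L}[\mathbf{x}'(t)] = \mathcal{L}\begin{bmatrix} x_1'(t) \\ x_2'(t) \end{bmatrix} = \begin{bmatrix} \mathcal{L}[x_1'(t)] \\ \mathcal{L}[x_2'(t)] \end{bmatrix} = \begin{bmatrix} sX_1(s) - x_1(0) \\ sX_2(s) - x_2(0) \end{bmatrix} = sX(s) - x(0)$$
where we let X(s) denote the Laplace transform of the vector function x(t).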
Letting F(s) be the transform of the vector f(t), we may deduce from (5.7.21) 
and theorem 5.3.4 that 


sX(s) —x(0) = AX(s) + F(s) (5.7.22) 
(a) Solve (5.7.22) for X(s) to show that 
X(s) = Z(s)(F(s) +b) (5.7.23) 


where $Z(s) = (sI - A)^{-1}$ and b = x(0). Explain why we must assume that s
is not an eigenvalue of A when we write X(s) in the form (5.7.23).


(b) Next we solve an example system in step-by-step fashion. Consider the IVP 


2 
x’ =| & : J=+[%]: x0=|5| (5.7.24) 


(i) Compute F(s) and hence show that 


1 
F(s) +x(0) = | at ‘| 


Ss 


(ii) Use the given coefficient matrix A to compute $Z(s) = (sI - A)^{-1}$ and
conclude that
1 s—3 0 
Z(s) = ————___— 
2 snez | -1 oe 


(iii) Compute X(s) using (5.7.23) to show that 


os s(s— 2) B 


(iv) Finally, use the inverse Laplace transform component-wise on X(s) 
(using standard inverse transform techniques) to find 


7 yak 
x(t) =L“[X(s)] = | | 


(c) Use Laplace transforms and the solution technique outlined in (b) above 
to find the solution of each system of IVPs below. 


@x=[ 1 7 fe+[ Sn] xo=[7] 
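
The algebra in (5.7.23) can be explored numerically for any 2 × 2 system. The following Python sketch is illustrative only: the matrix A, the vector F, and the initial vector b used in the example call are hypothetical values chosen for the demonstration and are not taken from the exercises above. It builds $Z(s) = (sI - A)^{-1}$ from the usual 2 × 2 inverse formula and evaluates $X(s) = Z(s)(F(s) + b)$ at a single value of s.

```python
def resolvent(A, s):
    """Return Z(s) = (sI - A)^(-1) for a 2x2 matrix A given as nested lists."""
    a, b = s - A[0][0], -A[0][1]
    c, d = -A[1][0], s - A[1][1]
    det = a * d - b * c          # zero exactly when s is an eigenvalue of A
    if det == 0:
        raise ValueError("s is an eigenvalue of A, so sI - A is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

def solve_transformed(A, F, b, s):
    """Evaluate X(s) = Z(s)(F(s) + b) at one s; F holds the values of F(s)."""
    Z = resolvent(A, s)
    v = [F[0] + b[0], F[1] + b[1]]
    return [Z[0][0] * v[0] + Z[0][1] * v[1],
            Z[1][0] * v[0] + Z[1][1] * v[1]]

# Hypothetical data: A has eigenvalues 3 and -2.
A = [[1, 2], [3, 0]]
print(resolvent(A, 5))
print(solve_transformed(A, [0.2, 0.0], [1.0, 0.0], 5))
```

Calling `resolvent(A, 3)` raises an error because 3 is an eigenvalue of this A, which mirrors the restriction on s discussed in part (a).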
Nonlinear systems of differential equations


6.1 Motivating problems 


In our studies so far, we have seen that a variety of interesting physical situations 
can be modeled by linear systems of differential equations. Moreover, nearly all 
linear systems may be solved explicitly. But, many important phenomena are 
nonlinear in nature; in order to motivate our upcoming work with such systems, 
we consider two applications where nonlinear systems of equations arise. 

A pendulum is a mesmerizing phenomenon. Whether on a grandfather 
clock or in the hand of a hypnotist, there is something fascinating about its 
motion. It turns out that a nonlinear second-order differential equation (and 
hence a system of nonlinear first-order equations) models its behavior. To 
develop this differential equation, let a rigid arm of length L be attached to 
a point from which it may swing freely. In this discussion, we will assume for 
simplicity that no damping is present. Similarly, to simplify the physics we 
assume that the arm itself has negligible mass. Finally, we attach a mass m to the 
end of the rigid arm and set the pendulum in motion, as shown in figure 6.1. 

We are interested in how the mass travels along a circular arc once the mass
is set in motion. The quantities of interest to us are noted in figure 6.1; the
variable $\theta$ represents the angle (in radians) the arm makes with the vertical axis
and s denotes the displacement of the center of the mass along the circular arc.

Because the mass is traveling along a circular arc, it follows that $s = L\theta$.
Noting that both s and $\theta$ are implicit functions of t, we can differentiate with
respect to t and find $s'(t) = L\theta'(t)$ and $s''(t) = L\theta''(t)$. In particular, the velocity
of the center of the mass along the arc is $s'(t)$ and its acceleration is $s''(t)$.

Figure 6.1 A simple pendulum.

Figure 6.2 Component of gravity's force along the pendulum's motion.

Since the acceleration a(t) is given by $a(t) = s''(t)$, we have
$$a(t) = \frac{d^2 s}{dt^2} = L\frac{d^2\theta}{dt^2} \qquad (6.1.1)$$


Since we have assumed that there is no damping present, once the mass 
is set in motion the only force acting on the pendulum is gravity. Because we 
are studying the displacement, velocity, and acceleration of the mass along its 
path, we must consider the magnitude of the weight W = mg in the direction 
of motion. From figure 6.2, we see that gravity induces a force of magnitude
$W\sin\theta$ along the circular arc. Note, too, that this force opposes the motion of
the pendulum, assuming $s'(t)$ is positive.
From Newton's second law, F = ma, it now follows that $ma = -mg\sin\theta$,
or
$$a(t) = -g\sin\theta(t) \qquad (6.1.2)$$
Using the two equivalent expressions for acceleration in (6.1.1) and (6.1.2), it
follows that
$$L\frac{d^2\theta}{dt^2} = -g\sin\theta \qquad (6.1.3)$$
If we assume that an initial displacement angle $\theta(0) = \theta_0$ and initial angular
velocity $\theta'(0) = \theta_0'$ are given, then after rearranging (6.1.3) it follows that $\theta$
satisfies the initial-value problem

$$\theta'' + \frac{g}{L}\sin\theta = 0, \qquad \theta(0) = \theta_0, \quad \theta'(0) = \theta_0' \qquad (6.1.4)$$

Because of the presence of $\sin\theta$ in this equation, this second-order differential
equation is nonlinear, which means that none of our previous solution methods
apply. If we use the substitution $x_1 = \theta$ and $x_2 = \theta'$ to recast (6.1.4) as a nonlinear
system of first-order differential equations, then it turns out that the system has
a natural graphical interpretation through its slope field, just as we saw with
linear systems of differential equations. Using this substitution, we observe that
the pendulum is governed by the system

$$x_1' = x_2$$
$$x_2' = -\frac{g}{L}\sin x_1$$

with initial conditions $x_1(0) = \theta_0$ and $x_2(0) = \theta_0'$. Besides studying the associated
slope field, we will also learn that it is possible to approximate this nonlinear 
system at key points with a linear system to better understand its behavior, 
particularly at any equilibrium points it may have. In subsequent sections, we 
will explore these issues in greater detail and return to this example involving 
the pendulum several times, including an investigation of what happens when 
friction is present. 
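
Even though we cannot solve the pendulum system with our earlier methods, it is straightforward to approximate its solutions numerically. The following Python sketch is offered only as an illustration: it uses the classical fourth-order Runge-Kutta method (a standard numerical scheme, not developed in this section), takes g/L = 1 as an arbitrary assumed value, and monitors the quantity $\frac{1}{2}x_2^2 - \frac{g}{L}\cos x_1$, which stays constant along exact solutions of the undamped system. All function names are ours.

```python
import math

def pendulum_rhs(x1, x2, g_over_L=1.0):
    """Right-hand side of x1' = x2, x2' = -(g/L) sin x1."""
    return x2, -g_over_L * math.sin(x1)

def rk4_step(x1, x2, h, g_over_L=1.0):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    k1 = pendulum_rhs(x1, x2, g_over_L)
    k2 = pendulum_rhs(x1 + h / 2 * k1[0], x2 + h / 2 * k1[1], g_over_L)
    k3 = pendulum_rhs(x1 + h / 2 * k2[0], x2 + h / 2 * k2[1], g_over_L)
    k4 = pendulum_rhs(x1 + h * k3[0], x2 + h * k3[1], g_over_L)
    x1 += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
    x2 += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return x1, x2

def energy(x1, x2, g_over_L=1.0):
    """Quantity conserved by the undamped pendulum."""
    return 0.5 * x2 ** 2 - g_over_L * math.cos(x1)

def simulate(theta0, omega0, h=0.01, steps=1000, g_over_L=1.0):
    """Return the state (x1, x2) after the given number of steps."""
    x1, x2 = theta0, omega0
    for _ in range(steps):
        x1, x2 = rk4_step(x1, x2, h, g_over_L)
    return x1, x2

final = simulate(1.0, 0.0)
print(final, energy(1.0, 0.0), energy(*final))
```

The near-equality of the two energy values is a useful check that the step size is small enough; a pendulum released from rest at 1 radian should also never swing past 1 radian on either side.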

In addition to the pendulum, another system of nonlinear differential 
equations arises in the study of population dynamics. Let us consider a 
population W(t) of wolves (in hundreds) that prey upon a population M(t) 
of moose (in hundreds), where t is time measured in years. A good example of 
such a situation, and one that biologists have studied in detail, occurs on Isle 
Royale in Lake Superior. On this remote island, wolves are the only predator of 
moose and moose are essentially the only prey of wolves. 

Suppose that in the absence of moose, the wolves would die off at a rate 
proportional to their own number according to a differential equation such as
$$\frac{dW}{dt} = -0.75W$$
In the presence of moose, however, we expect more of the wolves to be able to
survive, and to do so at a rate proportional to the moose-wolf interactions since
these can result in food for the wolves. The number of moose-wolf interactions
can be modeled by taking the product of M and W; only some fraction of such
interactions will be beneficial to the wolves. Thus, the wolf population can be
assumed to satisfy a differential equation of the form
$$\frac{dW}{dt} = -0.75W + 0.25MW \qquad (6.1.5)$$


Likewise, in the absence of wolves, we would expect the number of moose to 
grow unencumbered (at least in the short term). We might, therefore, have a 
differential equation like 


$$\frac{dM}{dt} = 0.5M$$
But with wolves around, some of the moose will die due to moose-wolf
interactions, hence we assume the moose population satisfies an equation like
$$\frac{dM}{dt} = 0.5M - 0.1MW \qquad (6.1.6)$$
Equations (6.1.5) and (6.1.6) lead to the system of nonlinear differential
equations
$$\frac{dW}{dt} = -0.75W + 0.25MW$$
$$\frac{dM}{dt} = 0.5M - 0.1MW$$


Systems of this form (regardless of the values of the constants) are typically 
known as predator-prey or Lotka—Volterra equations. Factoring the right-hand 
side in each equation above, we see that the wolf and moose populations satisfy
$$\frac{dW}{dt} = W(-0.75 + 0.25M)$$
$$\frac{dM}{dt} = M(0.5 - 0.1W)$$


from which it is evident that the system of differential equations has not 
only the obvious equilibrium point at the origin, but also one at (5,3). What 
kind of behavior should we expect for the wolf and moose populations for 
initial conditions near (5,3)? In particular, is this equilibrium point stable? 
Are there ways we can approximate this nonlinear system with a linear one? 
These questions and more are the focus of subsequent sections as we investigate 
nonlinear systems of DEs. Our in-depth study of linear systems of differential 
equations in chapter 3 will prove useful in the study of nonlinear systems: as we 
see in section 6.2, we can study the graphical behavior of solutions to nonlinear 
systems in the phase plane by plotting a direction field, just as we did with 
linear systems. Moreover, in section 6.3 we will study a process by which we 
can approximate the nonlinear system at a point by a linear system and use our 
understanding of the behavior of linear systems to make predictions about the 
nonlinear system. 
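
Before turning to those tools, the equilibrium points of the wolf-moose system can be confirmed by direct substitution. The brief Python sketch below evaluates the right-hand sides of the factored system at the origin, at the point W = 5, M = 3, and at two nearby points; the function name `predator_prey_rhs` is ours.

```python
def predator_prey_rhs(W, M):
    """Right-hand sides dW/dt and dM/dt of the wolf-moose system."""
    dW = W * (-0.75 + 0.25 * M)
    dM = M * (0.5 - 0.1 * W)
    return dW, dM

for point in [(0, 0), (5, 3), (5.5, 3), (5, 3.5)]:
    print(point, predator_prey_rhs(*point))
```

Both rates vanish at (0, 0) and at W = 5, M = 3. With slightly more wolves than 5 the moose population begins to fall, and with slightly more moose than 3 the wolf population begins to rise, which hints at the cyclic behavior explored in later sections.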
6.2 Graphical behavior of solutions for 2 × 2 nonlinear systems


In our study of single first-order initial-value problems in chapter 2, we learned 
that every IVP associated with a linear differential equation with sufficiently 
well-behaved coefficient functions has a unique solution; moreover, we can 
determine an explicit formula for the solution. As we learned in chapter 3, 
essentially the same situation holds for linear systems of differential equations; 
those with constant coefficients and their corresponding IVPs can always be 
solved. However, in the case when the governing differential equation or system 
of equations is nonlinear, we are not guaranteed that solutions to initial-value 
problems exist, nor that they are unique when they do exist. In addition, as we 
now study nonlinear systems, we will find that even when unique solutions exist, 
we are usually unable to determine explicit formulas for them. 

We therefore turn again to graphical and numerical investigations of the 
qualitative properties of solutions to nonlinear systems in order to understand 
their short- and long-term behavior. To begin, let us choose an example through 
which we can develop intuition. We consider the system given by 


Seas (6.2.1) 
If we let
$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$
and $F: \mathbb{R}^2 \to \mathbb{R}^2$ be the function defined by
F(x) = F(x), x) = (% — x7, %1 — 3) 
then it follows that we may view (6.2.1) as having the form 
x’ = F(x) (6.2.2) 


This is analogous to our work with linear systems of differential equations that 
may be expressed in the form x’ = Ax, where A is a matrix. In that setting, the 
right-hand side of the system is a linear function of x, but in (6.2.2), F(x) is 
not linear. Nonetheless, a graphical interpretation of the system remains both 
possible and enlightening. 

In section 3.4, we discussed the graphical behavior of a vector function. 
Here, we simply remind ourselves that for the system x’ = F(x) in (6.2.1), a 
solution x(t) is a vector function whose output lies in $\mathbb{R}^2$ and whose graph is 
the curve that is traced out by the vectors x(t) at various times t. Moreover, the 
derivative x’(t) of x(t) is itself a vector function that indicates the instantaneous 
velocity of a particle traveling along the curve traced out by x(t). In particular, 
scalar multiples of x’(t) tell us the direction of motion or flow along the solution 
curve as time increases. 

We therefore turn again to direction fields to study the flow of the solution 
curves through the vector field generated by the system of differential equations. 
In particular, (6.2.2) indicates how, for any point $(x_1, x_2)$ in the plane, we can 
easily compute $x' = F(x_1, x_2)$ at that point, and hence know the direction of the 
flow of the solution curve that passes through that point. Using a computer 
algebra system to execute these computations repeatedly at points sampled 
throughout the plane, we can view the direction field for the nonlinear system, 
which is analogous to the direction field for a linear system. A direction field 
for (6.2.1) is shown in figure 6.3. 
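
The repeated computation described above can be sketched in a few lines of code. The following Python sketch (Python is used here only for illustration; the plots in this text were produced in Maple) evaluates F from (6.2.2) on a coarse grid and scales each vector to unit length, which is the data a direction field displays. The names `F` and `direction_field` and the grid spacing are illustrative choices, not taken from the text.

```python
import math

def F(x1, x2):
    """Right-hand side of (6.2.1): x1' = x2 - x1^3, x2' = x1 - x2^3."""
    return (x2 - x1**3, x1 - x2**3)

def direction_field(xs, ys):
    """Return {(x1, x2): unit direction vector} for each grid point.

    At an equilibrium, F is the zero vector and no direction is defined,
    so (0.0, 0.0) is stored there instead.
    """
    field = {}
    for x1 in xs:
        for x2 in ys:
            u, v = F(x1, x2)
            length = math.hypot(u, v)
            field[(x1, x2)] = (u / length, v / length) if length > 0 else (0.0, 0.0)
    return field

# Sample the window [-2, 2] x [-2, 2] in steps of 0.5 (an arbitrary choice).
grid = [-2 + 0.5 * i for i in range(9)]
field = direction_field(grid, grid)
print(field[(0.5, 0.0)])   # direction of flow through the point (0.5, 0)
```

A plotting routine would then draw a short arrow at each grid point in the stored direction.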

The $x_1$-$x_2$ plane is again called the phase plane; the independent variable t 
remains implicit in the flow, while the behavior of the curve relative to the 
coordinate axes demonstrates the interrelationship among the components 
$x_1(t)$ and $x_2(t)$ of the solution x(t). Sample solution curves, such as those plotted 
in figure 6.4, are typically called trajectories. In section 6.4 we will learn how to 
construct trajectories for systems through numerical approximation techniques 
such as Euler’s method. 

From figures 6.3 and 6.4, it appears that the system (6.2.1) has three 
equilibrium solutions. Specifically, the behavior of trajectories suggests the 
possibilities of equilibria at (—1,—1), (0,0), and (1,1). We can confirm this 
algebraically by setting x’ = 0 and solving the resulting nonlinear system of 
equations 


\[
0 = x_2 - x_1^3 \tag{6.2.3}
\]
\[
0 = x_1 - x_2^3 \tag{6.2.4}
\]

Figure 6.3 The direction field for the system 
x’ = F(x) given in (6.2.1). 

Figure 6.4 The direction field for the system 
x’ = F(x) given in (6.2.1) with three trajectories. 


Equation (6.2.3) implies that $x_2 = x_1^3$. Substituting this result in (6.2.4), it follows 
that 
\[
0 = x_1 - (x_1^3)^3
\]
Factoring, we see 
\[
0 = x_1(1 - x_1^8) = x_1(1 - x_1^4)(1 + x_1^4) = x_1(1 - x_1^2)(1 + x_1^2)(1 + x_1^4)
\]
from which we determine that $x_1 = 0$, $1$, or $-1$. Recalling that $x_2 = x_1^3$, the 
corresponding $x_2$-values are $x_2 = 0$, $1$, and $-1$, and we have found that the 
equilibrium points of the system (6.2.1) are indeed $(-1, -1)$, $(0, 0)$, and $(1, 1)$. 
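
When the algebra is less cooperative, the same substitution can be carried out numerically. As a hedged illustration in Python (not the Maple used elsewhere in this text), the sketch below applies bisection to the scalar equation obtained by substituting $x_2 = x_1^3$ into (6.2.4). The brackets are chosen by hand, and the helper names `h` and `bisect` are assumptions made for this example.

```python
def h(x1):
    """Scalar equation left after substituting x2 = x1^3 into (6.2.4)."""
    return x1 - x1**9

def bisect(func, lo, hi, tol=1e-12):
    """Locate a root of func in [lo, hi], assuming a sign change on the interval."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_mid == 0.0:
            return mid
        if (f_lo < 0) != (f_mid < 0):
            hi = mid               # the sign change lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid  # the sign change lies in [mid, hi]
    return 0.5 * (lo + hi)

# Brackets chosen by hand so that each contains exactly one sign change of h.
brackets = [(-1.5, -0.5), (-0.5, 0.4), (0.5, 1.5)]
equilibria = []
for lo, hi in brackets:
    x1 = bisect(h, lo, hi)
    equilibria.append((x1, x1**3))   # recover x2 from x2 = x1^3
print(equilibria)
```

The three points returned agree, up to the tolerance, with the equilibria found by factoring.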

Here, we see another distinction between linear and nonlinear systems of 
differential equations. For a linear system x’ = Ax, the search for equilibrium 
solutions means we must solve Ax = 0, which we know has either a unique 
solution or infinitely many solutions. With nonlinear systems, it is possible 
that any number of equilibrium solutions exist (from none to infinitely many). 
Moreover, there are no guarantees that we can even expect to analytically solve 
the resulting system of nonlinear algebraic equations to find such equilibria. 

When we do find equilibrium solutions to a system, it is natural to ask 
about their stability. For example, for the equilibrium solution (0, 0) to (6.2.1), 
we might observe from figure 6.3 that the origin seems to exhibit behavior similar 
to a saddle point and therefore may be unstable. To investigate this further, one 
option is to see if there is a linear system of differential equations to which we 
can compare (6.2.1). For $x_1$ and $x_2$ near zero, observe that both $x_1^3$ and $x_2^3$ are 
extremely small, so that in this region close to the origin it is reasonable for us 
to say that 
\[
\begin{aligned}
x_1' &\approx x_2 \\
x_2' &\approx x_1
\end{aligned}
\tag{6.2.5}
\]
In particular, note that the approximate system is linear, and we can write 
$x' \approx Ax$ for x near 0 with 
\[
A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \tag{6.2.6}
\]
The eigenvalues of the matrix A are $\lambda_1 = -1$ and $\lambda_2 = 1$ with corresponding 
eigenvectors $v_1 = [-1 \;\; 1]^T$ and $v_2 = [1 \;\; 1]^T$. Due to the fact that the eigenvalues 
are real and of opposing signs, it follows that the origin is indeed a saddle point 
for this approximating linear system and is therefore unstable. The phase plane 
for the linear system corresponding to (6.2.6) near 0 is displayed in figure 6.5. 
This behavior is consistent with that observed near the origin in figure 6.3. 
We will call the system x’ = Ax, where A is given by (6.2.6), the linearization 
of (6.2.1) near 0. In section 6.3, we will study this approximation to a nonlinear 
system of differential equations near any particular point of interest to us. 
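
The eigenvalue claim for the matrix in (6.2.6) is easy to check by machine. The following Python sketch (an illustration only; the function name `eigenvalues_2x2` is an assumption of this example) computes the eigenvalues of a 2 x 2 matrix from its trace and determinant, using the characteristic polynomial.

```python
import cmath

def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from the characteristic polynomial
    lambda^2 - (a + d) lambda + (ad - bc) = 0."""
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return ((trace - root) / 2, (trace + root) / 2)

lam1, lam2 = eigenvalues_2x2(0, 1, 1, 0)   # the matrix A in (6.2.6)
print(lam1, lam2)
```

Using `cmath.sqrt` rather than `math.sqrt` lets the same function return complex eigenvalues when the discriminant is negative.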

We close this section with two examples of nonlinear systems in which 
we determine all equilibrium solutions and examine the graphical behavior of 
solutions near the equilibria. 


Example 6.2.1 Consider the system of differential equations given by 


\[
\begin{aligned}
x_1' &= \sin x_2 \\
x_2' &= x_2 - x_1^2
\end{aligned}
\tag{6.2.7}
\]


Determine all equilibrium solutions of the system, plot the direction field, and 
discuss the behavior of solutions near at least two of the equilibrium solutions. 

Figure 6.5 The direction field for the linear 
system x’ = Ax given in (6.2.5). 

Solution. To find the equilibrium solutions, we set $x_1' = x_2' = 0$ and solve the 
system of equations 
\[
0 = \sin x_2 \tag{6.2.8}
\]
\[
0 = x_2 - x_1^2 \tag{6.2.9}
\]
Equation (6.2.8) implies that $x_2$ must be an integer multiple of $\pi$, while (6.2.9) 
shows that $x_1$ and $x_2$ must satisfy the relationship $x_1^2 = x_2$. This latter equation 
implies that $x_2$ must be non-negative, and therefore with $x_2 = k\pi$ for any 
non-negative integer k, it follows that $x_1 = \pm\sqrt{k\pi}$ and we have equilibrium solutions 
of the form $(\sqrt{k\pi}, k\pi)$, $(-\sqrt{k\pi}, k\pi)$ for $k = 0, 1, 2, \ldots$. 
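
The family of equilibria just derived can be listed and checked mechanically. The Python sketch below (illustrative only; the function name `equilibria_in_window` and the rectangular window passed to it are choices made for this example) enumerates the points $(\pm\sqrt{k\pi}, k\pi)$ that fall inside a given rectangle and confirms that the right-hand side of (6.2.7) vanishes at each of them, up to rounding error.

```python
import math

def F(x1, x2):
    """Right-hand side of (6.2.7): x1' = sin(x2), x2' = x2 - x1^2."""
    return (math.sin(x2), x2 - x1**2)

def equilibria_in_window(x1_min, x1_max, x2_min, x2_max):
    """List the points (+/- sqrt(k*pi), k*pi), k = 0, 1, 2, ..., inside the window."""
    points = []
    k = 0
    while k * math.pi <= x2_max:
        x2 = k * math.pi
        root = math.sqrt(x2)
        candidates = [0.0] if k == 0 else [-root, root]
        for x1 in candidates:
            if x1_min <= x1 <= x1_max and x2 >= x2_min:
                points.append((x1, x2))
        k += 1
    return points

points = equilibria_in_window(-3, 3, -1, 8)
for p in points:
    u, v = F(*p)
    print(p, abs(u) < 1e-12 and abs(v) < 1e-12)
```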

Figure 6.6 The direction field for the system (6.2.7) with equilibrium 
points $(0,0)$, $(-\sqrt{\pi}, \pi)$, $(\sqrt{\pi}, \pi)$, $(-\sqrt{2\pi}, 2\pi)$, and $(\sqrt{2\pi}, 2\pi)$. 

An appropriate window in which to plot the direction field for this system 
might therefore be $[-3, 3] \times [-1, 8]$, as this will include the five equilibrium 
solutions $(0,0)$, $(-\sqrt{\pi}, \pi)$, $(\sqrt{\pi}, \pi)$, $(-\sqrt{2\pi}, 2\pi)$, and $(\sqrt{2\pi}, 2\pi)$. Plotting 
the direction field, as shown in figure 6.6, we see that the system appears to 
demonstrate familiar behavior around the equilibrium solutions. For example, 
the solutions $(\sqrt{\pi}, \pi)$ and $(-\sqrt{2\pi}, 2\pi)$ each seem to be a saddle point, 
based on the behavior of trajectories nearby. In addition, at the equilibrium 
points $(-\sqrt{\pi}, \pi)$ and $(\sqrt{2\pi}, 2\pi)$, the system appears to demonstrate spiraling 
behavior where the equilibria might act as stable centers or possibly as unstable 
spiral sources. Based on the periodicity of the sine function, we can reasonably 
expect that we would see similar behavior demonstrated at other equilibrium 
points of the form $(\pm\sqrt{k\pi}, k\pi)$, for $k = 3, 4, \ldots$. Note further that all equilibria 
lie along the parabola $x_2 = x_1^2$, as dictated by (6.2.9). Finally, it is evident that 
$(0, 0)$ is an unstable equilibrium, though the precise behavior of solutions nearby 
is not entirely clear from the plot. 
Indeed, it is apparent that we desire more precision, and not just in the vicinity 
of (0,0); our study of the linearization of a system of nonlinear differential 
equations in the next section will enable a much more rigorous understanding 
of a system’s behavior near any equilibrium point. 


Example 6.2.2 Consider the system of differential equations given by 


\[
\begin{aligned}
x_1' &= -x_1 + x_1 x_2^2 \\
x_2' &= -2x_2 + x_2 x_1
\end{aligned}
\tag{6.2.10}
\]
Determine all equilibrium solutions of the system, plot the direction field, and 
discuss the behavior of solutions near at least two of the equilibrium solutions. 


Solution. In the standard way, to find the equilibrium solutions we set $x_1' = 
x_2' = 0$ and solve the nonlinear system of equations 
\[
0 = -x_1 + x_1 x_2^2 = x_1(-1 + x_2^2) \tag{6.2.11}
\]
\[
0 = -2x_2 + x_2 x_1 = x_2(-2 + x_1) \tag{6.2.12}
\]
From (6.2.12), we see that either $x_2 = 0$ or $x_1 = 2$. If $x_2 = 0$, substituting this 
value for $x_2$ in (6.2.11), it follows that $x_1 = 0$, so one equilibrium solution is 
(0, 0). If $x_1 = 2$, then (6.2.11) implies that $-1 + x_2^2 = 0$, which in turn shows 
that $x_2 = \pm 1$. Thus, two additional equilibrium solutions have been found: (2, 1) 
and (2, -1). 

Figure 6.7 The direction field for the system (6.2.10) with equilibrium points (0,0), 
(2,1), and (2,-1). 

A reasonable window for plotting the direction field for this system is 
$[-2, 4] \times [-3, 3]$, since this will include the three equilibrium solutions we 
have found at (0,0), (2, 1), and (2, -1). As we see in figure 6.7, it appears that 
(0, 0) is a stable attracting fixed point and that both coordinate axes are straight-line 
solutions. This observation is not surprising if we also think about linear 
approximations: for $x_1$ and $x_2$ near zero, $x_1 x_2^2$ and $x_1 x_2$ will be extremely small, 
and thus for such values the nonlinear system (6.2.10) can be approximated by 
the linear system 
\[
\begin{aligned}
x_1' &= -x_1 \\
x_2' &= -2x_2
\end{aligned}
\tag{6.2.13}
\]
The linear system (6.2.13) has the obvious solutions $x_1(t) = e^{-t}$ and $x_2(t) = 
e^{-2t}$, which lead to the observed behavior near (0,0) in the nonlinear system. 
From figure 6.7, it also appears that the equilibrium points (2,1) and (2, -1) 
are saddle points. 
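
One way to see how good the linear approximation near the origin is involves comparing the two right-hand sides at points approaching (0,0). The Python sketch below is an illustration under stated assumptions: the function names and the sample points (r, r) are choices made here, not part of the text. It measures the gap between (6.2.10) and (6.2.13) relative to the size of the linear vector field.

```python
import math

def F(x1, x2):
    """Right-hand side of the nonlinear system (6.2.10)."""
    return (-x1 + x1 * x2**2, -2 * x2 + x2 * x1)

def F_linear(x1, x2):
    """Right-hand side of the approximating linear system (6.2.13)."""
    return (-x1, -2 * x2)

def relative_gap(x1, x2):
    """Size of F - F_linear relative to the size of F_linear at (x1, x2)."""
    u, v = F(x1, x2)
    lu, lv = F_linear(x1, x2)
    return math.hypot(u - lu, v - lv) / math.hypot(lu, lv)

for r in (1.0, 0.1, 0.01):
    print(r, relative_gap(r, r))
```

The relative gap shrinks roughly in proportion to r, which is what we expect when the neglected terms are of higher order than the linear ones.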


From all of our work in this section, we see that equilibrium solutions remain a 
vital part of our understanding of any system, whether linear or not. In addition, 
the picture painted by the direction field is fundamental to understanding the 
behavior of solutions to a nonlinear system. And yet, we are left desiring more 
detail than the direction field can provide. In section 6.3 we will develop the 
concept of the linearization of a system in order to link our understanding of 
linear systems to the behavior of nonlinear systems near equilibrium points. 
Furthermore, in section 6.4, we will generalize Euler’s method for single 
differential equations in order to apply it to systems and thereby generate 
approximate solutions. 


6.2.1 Plotting direction fields of nonlinear systems 
using Maple 


The Maple syntax used to generate the plots in this section is essentially identical 
to that discussed for direction fields for linear systems in section 3.4.1. As always, 
we use the DEtools package, and load it with the command 


> with(DEtools): 


To define the system of differential equations from example 6.2.1 in Maple, we 
use the command 


> sys := diff(x[1](t),t) = sin(x[2](t)), 
diff(x[2](t),t) = x[2](t) - x[1](t)^2; 


The system of differential equations of interest is now stored in “sys.” The 
direction field may now be generated by the command 


> DEplot([sys], [x[1](t),x[2](t)], t=-1..1, 
x[1]=-3..3, x[2]=-1..8, arrows=large, color=gray); 


In plots in section 6.2, we have also included the equilibrium points. These 
may be generated by the pointplot command, which requires us to load the 
plots package. For example, the syntax 


> with(plots): pointplot([[0,0], [sqrt(Pi),Pi], 
[-sqrt(Pi),Pi], [sqrt(2*Pi),2*Pi], [-sqrt(2*Pi), 
2*Pi]], symbol=circle, symbolsize=7); 


will produce a plot of just these five points in the plane. To superimpose these 
points on the direction field, we can assign names to each plot and then display 
them together. Giving the respective plots the names DF and EQsol, we can 
use the display command as follows. Note the use of colons, rather than 
semicolons, to suppress output when we assign names to the plots. 


> DF := DEplot([sys], [x[1](t),x[2](t)], t=-1..1, 
x[1]=-3..3, x[2]=-1..8, arrows=large, color=gray): 
> EQsol := pointplot([[0,0], [sqrt(Pi),Pi], 
[-sqrt(Pi),Pi], [sqrt(2*Pi),2*Pi], [-sqrt(2*Pi), 
2*Pi]], symbol=circle, symbolsize=7): 
> display(DF, EQsol); 


This combination of commands results in the output shown at left in 
figure 6.8. 


Figure 6.8 At left, the direction field for the system (6.2.7) with equilibrium 
points $(0,0)$, $(-\sqrt{\pi}, \pi)$, $(\sqrt{\pi}, \pi)$, $(-\sqrt{2\pi}, 2\pi)$, and $(\sqrt{2\pi}, 2\pi)$. At the 
right, the same direction field with trajectories through (2,6) and (-2,6) 
included. 

If desired, we can now sketch trajectories by hand. Maple has the capacity to 
include such trajectories, given initial conditions. For example, if we are given the 
initial conditions x(0) = (2, 6) and x(0) = (-2, 6), we can modify the earlier DEplot 
command to 


> DEplot([sys], [x[1](t),x[2](t)], t=-2..2, 
x[1]=-3..3, x[2]=-1..8, arrows=medium, color=gray, 
[[x[1](0)=2,x[2](0)=6], [x[1](0)=-2,x[2](0)=6]]); 


This most recent command, when saved and displayed simultaneously with the 
above plot of equilibrium solutions, results in the righthand plot in figure 6.8. 

As a reminder, we always expect to experiment some with the window in 
which the plot is displayed: the range of x- and y-values certainly affects how 
clearly the direction field is revealed, and the range of t-values impacts how 
much of each trajectory is plotted. As the most recent section shows, a study of 
a system’s equilibrium points is a helpful guide for choosing a window in which 
to display a plot. 


Exercises 6.2 

In exercises 1-7, (a) determine all equilibrium solutions, (b) use Maple to plot 
the direction field, and (c) from the direction field, visually estimate whether 
equilibrium solutions are stable or unstable and discuss the long-term behavior 
of solutions. 


1. $x_1' = x_2 - 2x_1 x_2$ 
$x_2' = 4x_1 x_2 - x_1$ 
2x = 4 — x} 

x =1l-xm+x 
3. xi = cosx) 

x, =1-sinx 
4, xi = 2x1 — x2 
= —4x, + 2x) 
5. xi =e % 

x = 1/1 +27) 
6. x} = In(2+ x) 
xi = xi +x 


7. x, =x— x! 


Gy = X= 8x5 
8. Recall from section 6.1 that the nonlinear system of differential equations 
$W' = -0.75W + 0.25MW$ 
$M' = 0.5M - 0.1MW$ 
models the numbers of wolves and moose (each measured in hundreds) in 
a predator-prey situation. Determine all equilibrium solutions to this 
system, plot an appropriate direction field in a computer algebra system, 
and discuss the apparent long-term behavior of the wolf and moose 
populations. 


9. Recall that if $x_1 = \theta$ is the angle that the arm of a pendulum forms with the 
positive x-axis (as shown in figure 6.2) and $x_2 = x_1' = \theta'$, then $x_1$ and $x_2$ 
satisfy the nonlinear system of differential equations 
\[
\begin{aligned}
x_1' &= x_2 \\
x_2' &= -\frac{g}{L} \sin x_1
\end{aligned}
\]
Let $g = 9.8$ m/s$^2$ and assume that the length of the arm is $L = 2$ m. 
Determine all equilibrium solutions to this system, plot an appropriate 
direction field in a computer algebra system, and discuss the long-term 
behavior of solutions to the system. Be sure to relate your answers directly 
to the behavior of the pendulum and corresponding initial conditions. 


6.3 Linear approximations of nonlinear systems 


In our first look at nonlinear systems in the preceding section, we considered 
the system 


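\[
\begin{aligned}
x_1' &= x_2 - x_1^3 \\
x_2' &= x_1 - x_2^3
\end{aligned}
\tag{6.3.1}
\]
and observed informally that near the origin where $x \approx 0$, we can drop the $x_1^3$ 
and $x_2^3$ terms so that (6.3.1) can be approximated by the linear system x’ = Ax 
where 
\[
A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \tag{6.3.2}
\]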


In this section, we make this notion of linear approximation of nonlinear systems 
more precise and use this approach to classify the stability of equilibria of 
nonlinear systems. 

An important idea in calculus is that all well-behaved functions are locally 
linear. That is, they appear linear when viewed up close; the line the function 
emulates is the tangent line to the curve at the point on which we focus. In 
particular, for a function f(x) that is differentiable at the value x = a, $f(x) \approx L(x)$ 
for x near a, where 
\[
L(x) = f(a) + f'(a)(x - a) \tag{6.3.3}
\]
The function L(x) is usually called the tangent line approximation or linearization 
of f at x = a. 




We encounter the very same ideas in multivariable calculus. For a 
differentiable vector function $r : \mathbb{R} \to \mathbb{R}^3$ given by 
\[
r(t) = \begin{bmatrix} f(t) \\ g(t) \\ h(t) \end{bmatrix}
\]
for values of t near some fixed value a, the curve in space that r(t) generates 
can be approximated by the tangent line to the curve. In particular, $r(t) \approx L(t)$ 
where 
\[
L(t) = r(a) + r'(a)(t - a) = \begin{bmatrix} f(a) + f'(a)(t - a) \\ g(a) + g'(a)(t - a) \\ h(a) + h'(a)(t - a) \end{bmatrix} \tag{6.3.4}
\]
for t near a. As in the case of the scalar function f, L is called the tangent line 
approximation or linearization of r at t = a. 

Similarly, for a differentiable real-valued function of several variables $F : 
\mathbb{R}^2 \to \mathbb{R}$ given by z = F(x, y), F(x, y) can be approximated by its tangent plane 
for (x,y) near some fixed point (a, b). That is, we have the approximation 
$F(x, y) \approx L(x, y)$ where 
\[
L(x, y) = F(a, b) + F_x(a, b)(x - a) + F_y(a, b)(y - b) \tag{6.3.5}
\]
L is called the tangent plane approximation or linearization of F at (a, b). 
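
As a small numerical illustration of (6.3.5), the Python sketch below builds the tangent plane approximation from a function and its two partial derivatives. The sample function $F(x, y) = x^2 y$ and the base point (1, 2) are assumptions chosen for this example and do not come from the text.

```python
def tangent_plane(F, Fx, Fy, a, b):
    """Return L(x, y) = F(a, b) + F_x(a, b)(x - a) + F_y(a, b)(y - b)."""
    F0, Fx0, Fy0 = F(a, b), Fx(a, b), Fy(a, b)
    return lambda x, y: F0 + Fx0 * (x - a) + Fy0 * (y - b)

# Illustrative function (an assumption, not from the text): F(x, y) = x^2 * y.
F = lambda x, y: x**2 * y
Fx = lambda x, y: 2 * x * y      # partial derivative with respect to x
Fy = lambda x, y: x**2           # partial derivative with respect to y
L = tangent_plane(F, Fx, Fy, 1.0, 2.0)
print(F(1.1, 2.1), L(1.1, 2.1))
```

Near the base point the two printed values are close, and they agree exactly at (1, 2) itself.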

There is obviously a great deal of similarity in the algebraic forms of the 
linear approximations given in (6.3.3), (6.3.4), and (6.3.5). How can we apply 
these ideas to systems of nonlinear differential equations? The next example, in 
which we reconsider (6.3.1), suggests one approach. Because of the pending use 
of partial derivatives, we will temporarily use the notation $x = [x_1 \;\; x_2]^T = [x \;\; y]^T$. 


Example 6.3.1 Consider the system of differential equations 


\[
\begin{aligned}
x' &= f(x, y) = y - x^3 \\
y' &= g(x, y) = x - y^3
\end{aligned}
\tag{6.3.6}
\]
Determine linear approximations to both f(x, y) and g(x, y) at the point (1, 1). 
Then explain how these linear approximations may be combined to form an 
overall linear approximation of (6.3.6) near (1, 1). 


Solution. In section 6.2, we considered this same system (using $x_1$ and $x_2$ for 
the functions, instead of x and y) and learned that the equilibrium solutions 
to the system are (—1,—1), (0,0), and (1,1). As noted at the start of this 
section, we have already considered a linear approximation of the system 
at (0,0). Here, we focus on the behavior of solutions near the equilibrium 
solution (1, 1). 

To first approximate $x' = f(x, y) = y - x^3$ near (1, 1), we use (6.3.5) to find 
the tangent plane approximation. Noting that $f_x(x, y) = -3x^2$ and $f_y(x, y) = 1$, 
it follows that $f_x(1,1) = -3$ and $f_y(1,1) = 1$. Moreover, $f(1,1) = 0$ since 
(1, 1) is an equilibrium solution of the system. Now, it follows that for (x, y) 
near (1, 1), 
\[
f(x, y) \approx f(1,1) + f_x(1,1)(x - 1) + f_y(1,1)(y - 1) = 0 - 3(x - 1) + 1(y - 1) \tag{6.3.7}
\]
Similar ideas applied to $y' = g(x, y) = x - y^3$ show that for (x, y) near (1, 1), 
\[
g(x, y) \approx g(1,1) + g_x(1,1)(x - 1) + g_y(1,1)(y - 1) = 0 + 1(x - 1) - 3(y - 1) \tag{6.3.8}
\]
If we now consider the overall system (6.3.6), for (x, y) near (1, 1) we have the 
approximation 
\[
\begin{aligned}
x' &= f(x, y) \approx -3(x - 1) + 1(y - 1) \\
y' &= g(x, y) \approx 1(x - 1) - 3(y - 1)
\end{aligned}
\tag{6.3.9}
\]
Using the fact that both equations in (6.3.9) are linear and writing this system 
in matrix form with $\mathbf{x} = [x \;\; y]^T$, we have 
\[
\mathbf{x}' \approx \begin{bmatrix} -3 & 1 \\ 1 & -3 \end{bmatrix} \begin{bmatrix} x - 1 \\ y - 1 \end{bmatrix} = \begin{bmatrix} -3 & 1 \\ 1 & -3 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} 2 \\ 2 \end{bmatrix} \tag{6.3.10}
\]


Hence we have approximated the original nonlinear system with a linear one 
by writing it in the form $x' \approx A(x - a) = Ax + b$, where $b = -Aa$, for x 
near a. 


Because we have found that we may approximate the system (6.3.6) with 
the linear system (6.3.10), we can now use our understanding of linear 
systems to determine the behavior of the nonlinear system near the chosen 
equilibrium point. Specifically, the fact that the eigenvalues of the matrix A 
in (6.3.10) are $\lambda = -2$ and $\lambda = -4$ tells us that the equilibrium solution (1, 1) 
of (6.3.1) is a stable, attracting node, as we initially conjectured graphically 
from figure 6.4. 
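
The computations in example 6.3.1 can be checked numerically. The Python sketch below is an illustration only: it estimates the four partial derivatives at (1, 1) with central differences (a stand-in for the exact derivatives computed by hand above; the step size is an arbitrary choice) and then finds the eigenvalues of the resulting 2 x 2 matrix from its trace and determinant.

```python
import cmath

def f(x, y):
    return y - x**3

def g(x, y):
    return x - y**3

def partials(func, a, b, h=1e-6):
    """Central-difference estimates of (d/dx, d/dy) of func at (a, b)."""
    dx = (func(a + h, b) - func(a - h, b)) / (2 * h)
    dy = (func(a, b + h) - func(a, b - h)) / (2 * h)
    return dx, dy

fx, fy = partials(f, 1.0, 1.0)   # close to -3 and 1
gx, gy = partials(g, 1.0, 1.0)   # close to 1 and -3

# Eigenvalues of [[fx, fy], [gx, gy]] from the characteristic polynomial.
trace = fx + gy
det = fx * gy - fy * gx
root = cmath.sqrt(trace**2 - 4 * det)
eigenvalues = ((trace - root) / 2, (trace + root) / 2)
print(eigenvalues)
```

Both eigenvalues come out real and negative, consistent with the classification of (1, 1) given above.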

Moreover, the approach we have taken in example 6.3.1 may certainly be 
generalized. Any nonlinear system of two differential equations may be written 
in the form 


x’ = F(x) (6.3.11) 


where F is a function of the form F(x) = F(x, y) = (f(x, y), g(x, y)). Given an 
equilibrium solution of (6.3.11) at a = (a, b), notice that F(a) = 0; in particular, 


$f(a, b) = g(a, b) = 0$. 

If, as in example 6.3.1, we approximate f and g near (a, b) with 
\[
\begin{aligned}
f(x, y) &\approx f(a, b) + f_x(a, b)(x - a) + f_y(a, b)(y - b) \\
&= f_x(a, b)(x - a) + f_y(a, b)(y - b) \\
g(x, y) &\approx g(a, b) + g_x(a, b)(x - a) + g_y(a, b)(y - b) \\
&= g_x(a, b)(x - a) + g_y(a, b)(y - b)
\end{aligned}
\]
we observe that in matrix form we have 
\[
\begin{aligned}
x' = F(x) &= \begin{bmatrix} f(x, y) \\ g(x, y) \end{bmatrix} \\
&\approx \begin{bmatrix} f_x(a, b)(x - a) + f_y(a, b)(y - b) \\ g_x(a, b)(x - a) + g_y(a, b)(y - b) \end{bmatrix} \\
&= \begin{bmatrix} f_x(a, b) & f_y(a, b) \\ g_x(a, b) & g_y(a, b) \end{bmatrix} \begin{bmatrix} x - a \\ y - b \end{bmatrix}
\end{aligned}
\]
In matrix notation, we have written that $x' = F(x) \approx J(a)(x - a)$ for x near a, 
where a is an equilibrium point of the original system and J(a) is a matrix with 
constant entries. The matrix J(a), which is defined by 
\[
J(a) = \begin{bmatrix} f_x(a, b) & f_y(a, b) \\ g_x(a, b) & g_y(a, b) \end{bmatrix} \tag{6.3.12}
\]


is known as the Jacobian matrix of the function F evaluated at the point (a, b). 

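More generally, for any differentiable function $F: \mathbb{R}^n \to \mathbb{R}^n$ given by $F(x) = 
F(x_1, \ldots, x_n) = (f_1(x_1, \ldots, x_n), \ldots, f_n(x_1, \ldots, x_n))$, the Jacobian matrix J(x) is 
given by 
\[
J(x) = \begin{bmatrix}
\partial f_1/\partial x_1 & \partial f_1/\partial x_2 & \cdots & \partial f_1/\partial x_n \\
\partial f_2/\partial x_1 & \partial f_2/\partial x_2 & \cdots & \partial f_2/\partial x_n \\
\vdots & \vdots & \ddots & \vdots \\
\partial f_n/\partial x_1 & \partial f_n/\partial x_2 & \cdots & \partial f_n/\partial x_n
\end{bmatrix} \tag{6.3.13}
\]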

The Jacobian enables us to write the linearization of any differentiable function 
F for x near a point a as 


\[
F(x) \approx F(a) + J(a)(x - a) \tag{6.3.14}
\]


which is remarkably similar to the tangent line approximation (6.3.3). Note that 
we must evaluate the Jacobian matrix at the point a of interest; moreover, if we 
are working with a nonlinear system of differential equations with equilibrium 
point a, it follows that F(a) = 0, so that we have 


\[
x' = F(x) \approx J(a)(x - a) \tag{6.3.15}
\]

This entire discussion of linearizing nonlinear systems is important for several 
reasons. One is that it demonstrates how we can take a problem we do not 
fully understand (the nonlinear system) and gain more knowledge of it by 
approximating the system near a point of interest with a simpler (linear) system 
that we do understand. Moreover, because we have completely classified the 
stability of equilibria of linear systems through the eigenvalues of the system’s 
matrix, we classify the equilibria of nonlinear systems by doing so for the 
corresponding linearization. We will use the same terminology and classification 
scheme for equilibria of nonlinear systems that we established for linear ones 
in sections 3.4 and 3.5. Two examples now follow to demonstrate these ideas in 
greater detail. 
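
Before turning to the examples, here is one way the general formulas (6.3.13) and (6.3.14) might be sketched in code. The Python below is illustrative only: it approximates each partial derivative by a central difference rather than computing it exactly, and the function names `jacobian` and `linearization` are assumptions of this sketch.

```python
def jacobian(F, a, h=1e-6):
    """Approximate the Jacobian matrix of F at the point a.

    F maps a list of n numbers to a list of n numbers; entry [i][j] of the
    result estimates the partial derivative of f_i with respect to x_j.
    """
    n = len(a)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        forward = list(a)
        backward = list(a)
        forward[j] += h
        backward[j] -= h
        F_plus, F_minus = F(forward), F(backward)
        for i in range(n):
            J[i][j] = (F_plus[i] - F_minus[i]) / (2 * h)
    return J

def linearization(F, a):
    """Return L with L(x) = F(a) + J(a)(x - a), as in (6.3.14)."""
    Fa, J = F(a), jacobian(F, a)
    def L(x):
        return [Fa[i] + sum(J[i][j] * (x[j] - a[j]) for j in range(len(a)))
                for i in range(len(a))]
    return L

# Illustration with the system (6.3.6), written as F(x, y) = (y - x^3, x - y^3).
F = lambda p: [p[1] - p[0]**3, p[0] - p[1]**3]
L = linearization(F, [1.0, 1.0])
print(F([1.05, 0.98]), L([1.05, 0.98]))
```

At a point close to (1, 1), the values of F and of its linearization L differ only in the third decimal place.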


Example 6.3.2 Given the system of differential equations 
\[
\begin{aligned}
x_1' &= 9x_2 - x_2^2 \\
x_2' &= x_1
\end{aligned}
\]
determine all equilibrium points of the system, evaluate the Jacobian at each 
equilibrium point, and find a corresponding linearization of the system in order 
to analyze the behavior of trajectories near each equilibrium point and the 
stability of equilibria. Finally, plot the direction field of the given system to 
confirm the observations made. 


Solution. First, we observe that x’ = F(x) for 
\[
F(x) = F(x_1, x_2) = (f(x_1, x_2), g(x_1, x_2)) = (9x_2 - x_2^2,\; x_1)
\]
Setting $x' = 0$, it follows that $x_1 = 0$ and $x_2(9 - x_2) = 0$, so that the equilibrium 
points of the system are (0, 0) and (0, 9). 

Taking the appropriate partial derivatives, the Jacobian of F is 
\[
J(x) = \begin{bmatrix} 0 & 9 - 2x_2 \\ 1 & 0 \end{bmatrix}
\]
Therefore, for values of $x_1$ and $x_2$ near the equilibrium point a = (0, 0) = 0, we 
have that $x' = F(x) \approx J(0)(x - 0)$, or 
\[
x' \approx \begin{bmatrix} 0 & 9 \\ 1 & 0 \end{bmatrix} x
\]
For this linear system, the eigenvalues of the matrix J(0) are $\lambda = 3$ and $\lambda = -3$, so 
the origin is a saddle point and therefore unstable. Moreover, we expect there to 
be two approximately straight-line solutions (along the respective eigenvectors 
of J(0)) that pass through the origin, along one of which the solution tends 
toward (0, 0) while on the other the solution is repelled away from (0, 0). 

For $x_1$ and $x_2$ near the equilibrium point a = (0, 9), we have that $x' = F(x) \approx 
J(a)(x - a)$, or 
\[
x' \approx \begin{bmatrix} 0 & -9 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} x_1 - 0 \\ x_2 - 9 \end{bmatrix} = \begin{bmatrix} 0 & -9 \\ 1 & 0 \end{bmatrix} x + \begin{bmatrix} 81 \\ 0 \end{bmatrix}
\]

Figure 6.9 The direction field for 
Example 6.3.2. 

For this nonhomogeneous linear system, the eigenvalues of the matrix J(0, 9) 
are $\lambda = 3i$ and $\lambda = -3i$. Because the eigenvalues are purely imaginary, it follows 
that the equilibrium point (0, 9) is a stable center. Near this point, we expect 
to see trajectories orbit the point in approximately elliptical loops. 

All of our observations are confirmed by the graphical behavior evidenced 
in figure 6.9. 
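
The classification carried out in this example follows a fixed recipe: evaluate the Jacobian at the equilibrium, find its eigenvalues, and read off the type. The Python sketch below illustrates that recipe for the two equilibria found above. The function names are assumptions of this sketch, and `classify` distinguishes only the cases that arise in this section.

```python
import cmath

def jacobian(x1, x2):
    """Jacobian of F(x1, x2) = (9*x2 - x2^2, x1), computed by hand."""
    return ((0.0, 9.0 - 2.0 * x2), (1.0, 0.0))

def eigenvalues(J):
    (a, b), (c, d) = J
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return ((trace - root) / 2, (trace + root) / 2)

def classify(J, tol=1e-12):
    """Name the equilibrium type of x' = Jx from the eigenvalues of J."""
    lam1, lam2 = eigenvalues(J)
    if abs(lam1.imag) > tol:                      # complex conjugate pair
        if abs(lam1.real) <= tol:
            return "center"
        return "spiral source" if lam1.real > 0 else "spiral sink"
    if lam1.real * lam2.real < 0:
        return "saddle"
    if lam1.real < 0 and lam2.real < 0:
        return "stable node"
    if lam1.real > 0 and lam2.real > 0:
        return "unstable node"
    return "degenerate (zero eigenvalue)"

for point in ((0.0, 0.0), (0.0, 9.0)):
    print(point, eigenvalues(jacobian(*point)), classify(jacobian(*point)))
```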


Example 6.3.3 For the system of differential equations 


\[
\begin{aligned}
x_1' &= \sin x_2 \\
x_2' &= x_2 - x_1^2
\end{aligned}
\tag{6.3.16}
\]
determine all equilibrium points of the system, evaluate the Jacobian at each 
equilibrium point, and find a corresponding linearization of the system in order 
to analyze the behavior of trajectories near each equilibrium point and the 
stability of equilibria. Finally, plot the direction field of the given system to 
confirm the observations made. 


Solution. The given system is the same one that we studied in example 6.2.1 
in the preceding section. There we discovered that for any equilibrium solution 
$x = (x_1, x_2)$, $x_2$ must be an integer multiple of $\pi$ and $x_1^2 = x_2$, so that $x_2$ must 
be non-negative. Thus, the equilibrium solutions have the form $(\sqrt{k\pi}, k\pi)$, 
$(-\sqrt{k\pi}, k\pi)$ for $k = 0, 1, 2, \ldots$. 

Letting $x' = F(x) = (\sin x_2,\; x_2 - x_1^2)$, it follows that the Jacobian of F is 
\[
J(x) = \begin{bmatrix} 0 & \cos x_2 \\ -2x_1 & 1 \end{bmatrix}
\]
For values of $x_1$ and $x_2$ near the equilibrium point a = (0, 0) = 0, we have that 
$x' = F(x) \approx J(0)(x - 0)$, or 
\[
x' \approx \begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix} x
\]


The eigenvalues of the matrix J(0) are $\lambda = 0$ and $\lambda = 1$, so the origin is unstable, 
because the real eigenvalue $\lambda = 1 > 0$ will drive solutions away from the origin 
as $t \to \infty$. Moreover, because $\lambda = 0$ is an eigenvalue of J(0), it also follows that 
all solutions near 0 are approximately straight-line solutions. 

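For $x_1$ and $x_2$ near the equilibrium point $a = (\sqrt{\pi}, \pi)$, we have that $x' = 
F(x) \approx J(a)(x - a)$, or 
\[
x' \approx \begin{bmatrix} 0 & -1 \\ -2\sqrt{\pi} & 1 \end{bmatrix} \begin{bmatrix} x_1 - \sqrt{\pi} \\ x_2 - \pi \end{bmatrix} = \begin{bmatrix} 0 & -1 \\ -2\sqrt{\pi} & 1 \end{bmatrix} x + \pi \begin{bmatrix} 1 \\ 1 \end{bmatrix}
\]
The eigenvalues of the matrix $J(\sqrt{\pi}, \pi)$ are approximately $\lambda = 2.448$ and 
$\lambda = -1.448$, and so the equilibrium point $(\sqrt{\pi}, \pi)$ is a saddle point and unstable. 
However, if we consider the equilibrium point $a = (-\sqrt{\pi}, \pi)$, we have that 
$x' = F(x) \approx J(a)(x - a)$, or 
\[
x' \approx \begin{bmatrix} 0 & -1 \\ 2\sqrt{\pi} & 1 \end{bmatrix} \begin{bmatrix} x_1 + \sqrt{\pi} \\ x_2 - \pi \end{bmatrix} = \begin{bmatrix} 0 & -1 \\ 2\sqrt{\pi} & 1 \end{bmatrix} x + \pi \begin{bmatrix} 1 \\ 1 \end{bmatrix}
\]
In this case, the eigenvalues of the matrix $J(-\sqrt{\pi}, \pi)$ are approximately 
$\lambda = 0.5 \pm 1.815i$. Because these complex eigenvalues have positive real parts, it 
follows that the equilibrium solution $(-\sqrt{\pi}, \pi)$ is a spiral source and is unstable. 
If we continue exploring equilibrium points of the form $(\pm\sqrt{k\pi}, k\pi)$, we 
can show through the Jacobian that whenever k is odd, the point $(\sqrt{k\pi}, k\pi)$ is a 
saddle point and the point $(-\sqrt{k\pi}, k\pi)$ is a spiral source. Conversely, whenever 
$k \geq 2$ is even, $(\sqrt{k\pi}, k\pi)$ is a spiral source and the point $(-\sqrt{k\pi}, k\pi)$ is a saddle. In 
particular, every equilibrium point of the system is unstable. 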
These observations are all confirmed in the direction field shown in 
figure 6.10. 
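
The alternating pattern described above can be spot-checked for the first several values of k. The Python sketch below is an illustration only: it uses the Jacobian of $F(x_1, x_2) = (\sin x_2, x_2 - x_1^2)$, and the helper names and the range k = 1, ..., 6 are choices made for this example.

```python
import cmath
import math

def jacobian(x1, x2):
    """Jacobian of F(x1, x2) = (sin x2, x2 - x1^2)."""
    return ((0.0, math.cos(x2)), (-2.0 * x1, 1.0))

def eigenvalues(J):
    (a, b), (c, d) = J
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return ((trace - root) / 2, (trace + root) / 2)

def kind(J):
    """Return 'saddle' or 'spiral source' when the eigenvalues fit, else 'other'."""
    lam1, lam2 = eigenvalues(J)
    if abs(lam1.imag) > 1e-12 and lam1.real > 0:
        return "spiral source"
    if abs(lam1.imag) <= 1e-12 and lam1.real * lam2.real < 0:
        return "saddle"
    return "other"

for k in range(1, 7):
    x2 = k * math.pi
    for x1 in (math.sqrt(x2), -math.sqrt(x2)):
        print(k, round(x1, 3), kind(jacobian(x1, x2)))
```

A finite check of this kind does not prove the pattern for all k, but it agrees with the determinant of the Jacobian, $2x_1 \cos k\pi$, changing sign with the parity of k and the sign of $x_1$.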


Through linear approximation, the tools we developed for linear systems 
enable us to understand and classify the stability of equilibria and behavior of 
solutions near equilibrium points for nonlinear systems. In the next section, we 
will explore how to actually compute approximate solutions via Euler’s method 
for systems. 


Exercises 6.3 

In exercises 1-6, find the Jacobian of the given function, F. 
L. F(x, x2) = (x7 +2, x1 — x3) 
2. F(x), x2) = (e2*!2, cos x; + sin x) 


3. $F(x_1, x_2) = (x_2 - 2x_1 x_2,\; 4x_1 x_2 - x_1)$ 

Figure 6.10 The direction field for the system (6.3.16) with equilibrium 
points $(0,0)$, $(-\sqrt{\pi}, \pi)$, $(\sqrt{\pi}, \pi)$, $(-\sqrt{2\pi}, 2\pi)$, and $(\sqrt{2\pi}, 2\pi)$. 


4. P(x, x2) = (4-23, 1— x7) 
5. F(x, x2,%3) = (1/(1 + x2 + x2 + x2), e123, 2xy — 3x? + x4) 
6. $F(x_1, x_2, x_3) = (3x_1 - x_2 + 4x_3,\; x_1 + x_2 - 2x_3,\; -2x_1 + 5x_2 - x_3)$ 


In exercises 7-10, find the linearization of the given function, F(x), x2), at the 
given point a. 


7. F(x, x2) = x? + x2, x1 — x3), a= (1, -1) 

8. F(x,, x2) = (xe, cosxy +sinx,), a= (m/2,0) 

9. $F(x_1, x_2) = (x_2 - 2x_1 x_2,\; 4x_1 x_2 - x_1)$, a = (1/2, 1/4) 
10. F(x), %) =(4-x3,1—x7), a=(-1,2) 


In exercises 11-17, find all equilibrium points of the system, determine 
the linearization of the given system near each equilibrium point, classify the 
stability of each equilibrium point, and compare your work to a plot of the 
direction field for the system.' 


11. $x_1' = x_2 - 2x_1 x_2$ 
$x_2' = 4x_1 x_2 - x_1$ 
12. 4 = 4-4 


x, = 1—x, +x 


! Note that in the exercises of section 6.2, equilibrium solutions were found and direction fields were 
plotted in exercises 1-7, which correspond to the same systems of differential equations given here. 

13. 


14, 


15. 


16. 


17. 


18. 


19. 


20. 




X, = cosxy 

x, =1—-sinx, 
x, = 2m — 2» 

x, = —4x) + 2x 
x Se? 

x, = 1/(1+x?) 
x, = In(2+ x) 


2 
Hy = XP +X 


—_ =: 
X= — x 
xi = x) — 8x2 
2 = X1 — OX) 


Recall from section 6.1 that the nonlinear system of differential equations 
W' = —0.75W +0.25MW 


M’=0.5M —0.1MW 


models the numbers of wolves and moose (each measured in hundreds) in 
a predator-prey situation. Determine the linearization of the system near 
the nonzero equilibrium solution, classify the stability of this equilibrium, 
and discuss the long-term behavior of the wolf and moose populations.” 


Recall that if $x_1 = \theta$ is the angle that the arm of a pendulum forms with the 
positive x-axis (as shown in figure 6.2) and $x_2 = x_1' = \theta'$, then $x_1$ and $x_2$ 
satisfy the nonlinear system of differential equations 
\[
\begin{aligned}
x_1' &= x_2 \\
x_2' &= -\frac{g}{L} \sin x_1
\end{aligned}
\]
Let $g = 9.8$ m/s$^2$ and $L = 2$ m. Determine the linearization of the system 
near the equilibrium solution at zero and at least one other equilibrium 
solution, classify the stability of these equilibria, and discuss the long-term 
behavior of the pendulum. Be sure to relate your answers directly to the 
behavior of the pendulum and corresponding initial conditions. 


In example 6.2.2, we considered the system of differential equations 
given by 

\[
\begin{aligned}
x_1' &= -x_1 + x_1 x_2^2 \\
x_2' &= -2x_2 + x_2 x_1
\end{aligned}
\]
Determine the linearization of the system near each equilibrium solution, 
classify the stability of each equilibrium point, and discuss the behavior of 
solutions nearby. 


2 In the exercises of section 6.2, equilibrium solutions were found and the direction field was plotted 
for this system in exercise 8; similarly, see the results of exercise 9 in section 6.2 for use in the problem 19 
below. 


6.4 Euler’s method for nonlinear systems 


Just as we experienced with single nonlinear initial-value problems such as 
y=ye’+1,  y(0)=1 (6.4.1) 

or 

y=trty H+, y(0)=-1 (6.4.2) 
that we could not solve explicitly, in the past two sections we have encountered 
systems of nonlinear differential equations for which solutions to corresponding 
initial-value problems cannot be determined analytically. We therefore desire 
to explore ways to estimate solutions to these problems. 

For IVPs such as (6.4.1) and (6.4.2), we know that we may estimate a 
solution to the problem through Euler’s method. Recall from Section 2.6 that 
for any first-order IVP in the form $y' = f(t, y)$, $y(t_0) = y_0$, given a step-size h 
we are able to generate the sequence of points $(t_1, y_1), \ldots, (t_n, y_n)$ such that 
\[
t_{n+1} = t_n + h \quad \text{and} \quad y_{n+1} = y_n + h f(t_n, y_n), \quad \text{for } n \geq 0 \tag{6.4.3}
\]
where $y_n \approx y(t_n)$. That is, $y_n$ approximates the solution y to the initial-value 
problem at the point where $t = t_n$. 

To explore how we can extend Euler’s method to systems of differential 
equations, let us consider the initial-value problem given by 


$$x' = 9y - y^2, \quad x(0) = 1$$
$$y' = x, \quad y(0) = 8 \qquad (6.4.4)$$


Here, we choose to use the notation $\mathbf{x} = [x \ y]^T$ rather than $[x_1 \ x_2]^T$ because
we will be using subscripts to label approximations to the component
solutions x(t) and y(t). Keeping in mind that x and y are each implicit functions
of t, we can view (6.4.4) as being of the form


$$x' = f(t, x, y), \quad x(t_0) = x_0$$
$$y' = g(t, x, y), \quad y(t_0) = y_0 \qquad (6.4.5)$$

To see how to approximate solutions to this system of IVPs, let us reconsider
our earlier studies of single differential equations. In section 2.6, we considered
the equation $y' = f(t, y)$ in a first-order IVP and emphasized the fact that Euler’s
method relies on following the tangent line approximation to y(t) at each step.
In particular, if we have some approximation $y_n$ to the solution y at the t-value
$t_n$, then to move along the tangent line to the next approximation $(t_{n+1}, y_{n+1})$,
it follows that


$$y_{n+1} = y_n + \Delta y = y_n + m \, \Delta t \qquad (6.4.6)$$


where m is the slope at each step of our approximation, given by $m = y' = f(t, y)$
in the differential equation that we are attempting to solve. Specifically, given
the approximation $y_n$ at time $t_n$, the slope of the tangent line to the solution
curve at this point is $f(t_n, y_n)$. Therefore, using this value for m in (6.4.6) and letting
$h = \Delta t$ be the step size, we have


$$y_{n+1} = y_n + h f(t_n, y_n) \qquad (6.4.7)$$


An essentially identical approach will work for the system (6.4.5). In particular,
given the initial condition $(x_0, y_0)$ and a step-size h, we can generate the
approximate solution $(x(t_1), y(t_1)) \approx (x_1, y_1)$ by taking


$$x_1 = x_0 + h \cdot f(t_0, x_0, y_0)$$
$$y_1 = y_0 + h \cdot g(t_0, x_0, y_0) \qquad (6.4.8)$$


The only difference between this approach and our experience with Euler’s
method for a single equation is that we have to update two
approximations at once, as estimates of both $x(t_n)$ and $y(t_n)$ are needed to
generate approximations of $x(t_{n+1})$ and $y(t_{n+1})$. We generalize our latest
observation in (6.4.8) for a step from the approximation $(x_n, y_n)$ to the
approximation $(x_{n+1}, y_{n+1})$ by


$$x_{n+1} = x_n + h \cdot f(t_n, x_n, y_n)$$
$$y_{n+1} = y_n + h \cdot g(t_n, x_n, y_n) \qquad (6.4.9)$$
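
The update rule (6.4.9) translates almost line for line into a short program. The sketch below is one possible Python version, not the Excel implementation discussed later in this section; the function name `euler_system` and its argument order are illustrative choices. It is applied to the system x' = 9y - y^2, y' = x with x(0) = 1, y(0) = 8, whose Euler approximations are reported in table 6.1.

```python
def euler_system(f, g, t0, x0, y0, h, steps):
    """Apply Euler's method for systems to x' = f(t, x, y), y' = g(t, x, y)."""
    t, x, y = t0, x0, y0
    points = [(t, x, y)]
    for _ in range(steps):
        # Both updates must use the values from the current step (t_n, x_n, y_n),
        # so the two new values are computed together before either is stored.
        x, y = x + h * f(t, x, y), y + h * g(t, x, y)
        t = t + h
        points.append((t, x, y))
    return points


# The system x' = 9y - y^2, y' = x with x(0) = 1, y(0) = 8 and step-size h = 0.1.
points = euler_system(lambda t, x, y: 9 * y - y**2,
                      lambda t, x, y: x,
                      t0=0.0, x0=1.0, y0=8.0, h=0.1, steps=3)
for t, x, y in points:
    print(f"{t:.1f}  {x:.6f}  {y:.6f}")
```

The simultaneous assignment is the one detail worth noticing: updating x first and then using the new x to update y would no longer be Euler's method as stated in (6.4.9).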


At the end of this section, we will discuss the implementation of Euler’s 
method for systems in Excel. For now, we simply report the results of such 
an implementation to see the approximations generated. For the original
system we considered above,


$$x' = 9y - y^2, \quad x(0) = 1$$
$$y' = x, \quad y(0) = 8 \qquad (6.4.10)$$


recall that this system was also studied in example 6.3.2 in section 6.3. There 
we observed that the equilibrium solution (0, 9) is a stable center of the system 
and that we expect elliptical orbits nearby. If, for the IVP (6.4.10), we choose a 
step-size of h = 0.1 and take enough steps to complete the expected loop in the 
orbit, we see the abbreviated data in table 6.1. 

In particular, we notice that after taking a sufficient number of steps to
loop back around to near the initial condition (1, 8), we have not
returned to this point; in fact, we have missed it appreciably, with the two nearest
approximations being (0.527, 6.259) and (2.243, 6.312).

If we decrease the step size h and take more steps, we can improve the 
accuracy of the approximation. Doing so with h = 0.01 results in the values in 
table 6.2. 

We see that the approximate trajectory has completed one full loop and has 
nearly returned to pass through the point (1, 8) where the trajectory began. This 
behavior is more consistent with what we expected based on the classification 
of the equilibrium point (0,9) as a stable center through linearization in the 
preceding section. 
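
The effect of the step-size on the loop can also be checked with a few lines of code. The following Python sketch reruns Euler's method on the system x' = 9y - y^2, y' = x from (1, 8) for h = 0.1 and h = 0.01 and reports the approximation that comes closest to the starting point. The helper name `closest_return` and the search window 1.5 <= t <= 2.5 (chosen here because tables 6.1 and 6.2 show the loop closing near t = 2.1) are assumptions made for illustration.

```python
import math


def closest_return(h, t_start=1.5, t_end=2.5):
    """Run Euler's method from (1, 8) and return (distance, t, x, y) for the
    approximation nearest to (1, 8) among steps with t_start <= t_n <= t_end."""
    x, y = 1.0, 8.0
    best = None
    steps = round(t_end / h)
    for n in range(1, steps + 1):
        # One Euler step for x' = 9y - y^2, y' = x, using the current (x, y).
        x, y = x + h * (9 * y - y**2), y + h * x
        t = n * h
        if t >= t_start:
            dist = math.hypot(x - 1.0, y - 8.0)
            if best is None or dist < best[0]:
                best = (dist, t, x, y)
    return best


for h in (0.1, 0.01):
    dist, t, x, y = closest_return(h)
    print(f"h = {h}: nearest to (1, 8) at t = {t:.2f}: "
          f"({x:.3f}, {y:.3f}), distance {dist:.3f}")
```

A smaller reported distance for h = 0.01 is the numerical counterpart of the observation that the finer trajectory nearly closes up.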
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Table 6.1 


Euler’s method applied to (6.4.10) with step-size h = 0.1

t_n | x_n          | y_n
0   | 1            | 8
0.1 | 1.8          | 8.1
0.2 | 2.529        | 8.28
0.3 | 3.12516      | 8.5329
... | ...          | ...
2   | -1.146540202 | 6.373703158
2.1 | 0.527383445  | 6.259049138
2.2 | 2.242958058  | 6.311787483
2.3 | 3.93970067   | 6.536083289

Table 6.2
Euler’s method applied to (6.4.10) with step-size h = 0.01

t_n  | x_n         | y_n
0    | 1           | 8
0.01 | 1.08        | 8.01
0.02 | 1.159299    | 8.0208
0.03 | 1.237838674 | 8.03239299
...  | ...         | ...
2.09 | 0.934286677 | 7.878994865
2.1  | 1.022610614 | 7.888337731
2.11 | 1.110302289 | 7.898563837
2.12 | 1.197299927 | 7.90966686

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In the first example with Euler’s method we just completed, we observe one of 
the major weaknesses of the method: when a large number of steps are needed 
and some of the changes in x and y are large, a substantial amount of error
accumulates in the calculations. While more sophisticated numerical methods exist
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(and are studied in chapter 7), for now we limit ourselves to Euler’s method in 
order to first get an intuitive feel for the numerical behavior of approximate 
solutions. Another example follows. 


Example 6.4.1 For the system of initial-value problems given by


$$x' = y - x^3, \quad x(0) = 2$$
$$y' = x - y^3, \quad y(0) = -1 \qquad (6.4.11)$$


estimate the solution to the IVP up to t = 5 using h = 0.1 and comment on the
behavior of the trajectory.


Solution. In the given problem, if we take the perspective that $x' = f(t, x, y)$
and $y' = g(t, x, y)$, then it follows that $f(t, x, y) = y - x^3$ and $g(t, x, y) = x - y^3$.
Applying (6.4.9) with h = 0.1, we have

$$x_{n+1} = x_n + 0.1 \cdot (y_n - x_n^3)$$
$$y_{n+1} = y_n + 0.1 \cdot (x_n - y_n^3)$$


Beginning this iteration with $x_0 = 2$ and $y_0 = -1$, we generate the following
table.

t_n | x_n         | y_n
0   | 2           | -1
0.1 | 1.1         | -0.7
0.2 | 0.8969      | -0.5557
0.3 | 0.769180708 | -0.448849846
... | ...         | ...
4.7 | 0.994536765 | 0.994533281
4.8 | 0.995620126 | 0.995618024
4.9 | 0.996490144 | 0.996488877
5   | 0.997188297 | 0.997187534


In the table, we see behavior consistent with the fact that the equilibrium point 
(1, 1) of the system is a stable attracting node. In addition, the numerical data is 
in agreement with the graphical behavior we expect based on the direction field 
in figure 6.4 where we first considered the given nonlinear system. This behavior 
is also seen in the plot in figure 6.11, which shows the $(x_n, y_n)$ data
for n = 0, ..., 50 generated by Excel.
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Figure 6.11 The trajectory for the IVP (6.4.11) generated by Euler’s method 
with h = 0.1. 


Example 6.4.1 shows that when small changes in t lead to very small changes in 
x(t) and y(t), such as near a stable, attracting node, Euler’s method produces 
reasonable approximations without having to resort to extremely small h-values. 
We also see the importance of having a theoretical understanding of the 
expected behavior in advance of executing computations in order to check the 
reasonableness of our results. 
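
One simple check of this kind is to confirm the first rows of the table in example 6.4.1 by hand. The arithmetic below substitutes the initial values $x_0 = 2$, $y_0 = -1$ and h = 0.1 into the Euler update for $f(t, x, y) = y - x^3$ and $g(t, x, y) = x - y^3$; it is offered only as a worked illustration of the rule, and it reproduces the first two computed rows of that table.

```latex
\begin{aligned}
x_1 &= x_0 + 0.1\,(y_0 - x_0^3) = 2 + 0.1\,(-1 - 8) = 1.1, \\
y_1 &= y_0 + 0.1\,(x_0 - y_0^3) = -1 + 0.1\,(2 + 1) = -0.7, \\
x_2 &= x_1 + 0.1\,(y_1 - x_1^3) = 1.1 + 0.1\,(-0.7 - 1.331) = 0.8969, \\
y_2 &= y_1 + 0.1\,(x_1 - y_1^3) = -0.7 + 0.1\,(1.1 + 0.343) = -0.5557.
\end{aligned}
```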


6.4.1 Implementing Euler’s method for systems in Excel


Just as we did for single initial-value problems in section 2.6.1, we will use Excel 
to generate approximate solutions to system IVPs. In this setting, given an
initial-value problem


$$x' = f(t, x, y), \quad x(t_0) = x_0$$
$$y' = g(t, x, y), \quad y(t_0) = y_0 \qquad (6.4.12)$$

we seek approximations $x_1, x_2, \ldots$ and $y_1, y_2, \ldots$ such that
$(x_n, y_n) \approx (x(t_n), y(t_n))$, where $t_{n+1} = t_n + h$ for some chosen step-size h. In particular, we
have shown that these approximations are generated using Euler’s method
by the rule


$$x_{n+1} = x_n + h \cdot f(t_n, x_n, y_n)$$
$$y_{n+1} = y_n + h \cdot g(t_n, x_n, y_n) \qquad (6.4.13)$$


In a spreadsheet, we will view the following data: step number n, step-size h, $t_n$,
$x_n$, $y_n$, $f(t_n, x_n, y_n)$, and $g(t_n, x_n, y_n)$, where $t_n$ is the value of the independent
variable and $(x_n, y_n) \approx (x(t_n), y(t_n))$ is an estimate of the solution to the IVP at
the value $t_n$. These data appear in a single row, which contains all of these
values for the corresponding n-value. From this, we naturally build subsequent
approximations $(x_{n+1}, y_{n+1})$ based on the preceding row.

We will demonstrate the development of such an Excel spreadsheet for the 
particular example 


$$x' = y - x^3, \quad x(0) = 2$$
$$y' = x - y^3, \quad y(0) = -1 \qquad (6.4.14)$$
that we investigated in example 6.4.1. 
To begin, we establish names for the various columns, say in cells A1 through
G1, and see on our screen in Excel the information below.


  | A | B | C   | D   | E   | F          | G
1 | n | h | t_n | x_n | y_n | f(x_n,y_n) | g(x_n,y_n)


In most of the examples we consider with Euler’s method, the system will be
autonomous (i.e., t does not appear explicitly in the functions f and g), and therefore we choose
to omit t from the column labels for $f(t_n, x_n, y_n)$ and $g(t_n, x_n, y_n)$.

In the subsequent row 2, we now enter the given data at step zero. In
particular, in cell A2 we enter the step number (“0”), in B2 the chosen step-size
(“0.1”), in C2 the starting t-value (“0”), in D2 the starting x-value (“2”),
and in E2 the starting y-value (“-1”). Next, in F2, we apply the function
f(t, x, y) to get the slope at the point at this step. That is, since in this IVP
$f(t, x, y) = y - x^3$, we enter in F2 the command “= E2 - D2^3”. Similarly,
since $g(t, x, y) = x - y^3$, in G2 we enter “= D2 - E2^3”. Now our spreadsheet
appears as follows.


  | A | B   | C   | D   | E   | F          | G
1 | n | h   | t_n | x_n | y_n | f(x_n,y_n) | g(x_n,y_n)
2 | 0 | 0.1 | 0   | 2   | -1  | -9         | 3


In the next row, row 3, we may now build subsequent entries based on
existing data. To increase the step number, in A3 we enter “= A2 + 1”. Since
the step-size stays constant throughout, in B3 we input “= B2”. Since the next
t-value will be the preceding t-value plus the step-size ($t_1 = t_0 + h$), we enter in
C3 the command “= C2 + B2”.

To compute the next x-value in cell D3 from Euler’s method, we know that
$x_1 = x_0 + h f(t_0, x_0, y_0)$. Hence, in D3 we write “= D2 + B2*F2”. Similarly,
to compute $y_1 = y_0 + h g(t_0, x_0, y_0)$, in cell E3 we enter “= E2 + B2*G2”.

Finally, we also need values of $f(t_1, x_1, y_1)$ and $g(t_1, x_1, y_1)$ for use in
the following step. This involves simply updating the functions f(t, x, y) and
g(t, x, y) at the given t-, x-, and y-values, so we select cell F2, copy it, and paste
it into cell F3. Equivalently, we can directly enter in F3 “= E3 - D3^3”.
We can similarly copy G2 into G3, or in G3 enter “= D3 - E3^3”. Below is
the current state of our spreadsheet.


  | A | B   | C   | D   | E    | F          | G
1 | n | h   | t_n | x_n | y_n  | f(x_n,y_n) | g(x_n,y_n)
2 | 0 | 0.1 | 0   | 2   | -1   | -9         | 3
3 | 1 | 0.1 | 0.1 | 1.1 | -0.7 | -2.031     | 1.443


Now we can harness the power of Excel to compute as many subsequent
steps as we like. By using the mouse to highlight the entries in row 3 (cells A3
through G3), and then placing the cursor on the bottom right corner of cell G3,
we can click and drag downward
to fill subsequent rows with similar calculations. For example, doing so through
row 7 yields the following.


  | A | B   | C   | D         | E          | F          | G
1 | n | h   | t_n | x_n       | y_n        | f(x_n,y_n) | g(x_n,y_n)
2 | 0 | 0.1 | 0   | 2         | -1         | -9         | 3
3 | 1 | 0.1 | 0.1 | 1.1       | -0.7       | -2.031     | 1.443
4 | 2 | 0.1 | 0.2 | 0.8969    | -0.5557    | -1.2771929 | 1.0685015
5 | 3 | 0.1 | 0.3 | 0.7691807 | -0.4488498 | -0.9039271 | 0.8596087
6 | 4 | 0.1 | 0.4 | 0.6787879 | -0.3628889 | -0.6756426 | 0.7265762
7 | 5 | 0.1 | 0.5 | 0.6112237 | -0.2902313 | -0.5185811 | 0.6356711
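
For readers who prefer a programming language to a spreadsheet, the same row-by-row construction can be mimicked in Python. In the sketch below, each tuple plays the role of one spreadsheet row (columns A through G), and the comments point to the corresponding Excel formulas. The names `build_rows`, `f`, and `g` are illustrative and are not part of Excel.

```python
def f(x, y):
    return y - x**3      # column F, as in "= E2 - D2^3"


def g(x, y):
    return x - y**3      # column G, as in "= D2 - E2^3"


def build_rows(h, x0, y0, last_n, t0=0.0):
    """Return one tuple (n, h, t_n, x_n, y_n, f, g) per spreadsheet row."""
    rows = [(0, h, t0, x0, y0, f(x0, y0), g(x0, y0))]
    for _ in range(last_n):
        n, step, t, x, y, fv, gv = rows[-1]   # the preceding row
        x_next = x + step * fv                # as in "= D2 + B2*F2"
        y_next = y + step * gv                # as in "= E2 + B2*G2"
        rows.append((n + 1, step, t + step, x_next, y_next,
                     f(x_next, y_next), g(x_next, y_next)))
    return rows


rows = build_rows(h=0.1, x0=2.0, y0=-1.0, last_n=5)
for n, h, t, x, y, fv, gv in rows:
    print(f"{n:2d} {h:4.2f} {t:4.1f} {x:10.7f} {y:10.7f} {fv:10.7f} {gv:10.7f}")
```

Changing the step-size here means changing the single argument `h`, which parallels editing cell B2 in the spreadsheet.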


As we have noted previously, besides the relative simplicity of these 
computations, there are further advantages Excel offers. One is that changing 
one appropriately chosen cell will update all of our computations. For example, 
if we are interested in the change induced by a different step-size, say h = 0.01, 
all we need to do is enter “0.01” in cell B2, and every other cell will update 
accordingly. In addition, if we desire to see the graphical results of our work, we 
can use Excel’s Chart Wizard. 

To plot the trajectory generated by our approximations, we can simultaneously
highlight the x and y columns in our spreadsheet above (cells D2 through D7
and E2 through E7), and then go to the Insert menu and select Chart (alternatively,
we may click on the Chart Wizard icon on the toolbar). In the prompt window 
that arises, we choose “XY (Scatter)” and select one of the graph style options at 
the right. By clicking “Next” in a few subsequent windows (in which advanced 
users can avail themselves of more options), we eventually get to a final window 
where our graph appears and the option to “Finish.” Clicking on “Finish,” the 
graph will appear in the spreadsheet and may be moved around by clicking and 
dragging it accordingly. We see the resulting plot displayed as in figure 6.12. 


Figure 6.12 An Excel plot of an approximate solution to the IVP (6.4.14).


Exercises 6.4
In exercises 1-7, use Euler’s method with the stated h-value to estimate the
solution of the given system of IVPs at the given t-value. Compare your work to
a plot of the direction field for the system and the classification of any relevant
equilibrium solutions.³


l. x’ = y—2xy, x(0) = 0.75 t=1,h=0.1 
y' = 4xy—x, y(0) = 0.5 

2. x =4-y’, x(0) = —2 t=1,h=0.05 
y =1-—x+4+y, y(0) =-1 

3. x’ = cosy, x(0) =2 t=l1,h=0.1 
y' = 1-sinx, y(0) = 3 

4. x'=2x-y, x(0) =1 t=1,h=0.1 
y' = —4x4+2y, y(0) = 1 

5.x =e’, x(0) = 0 t=1,h=0.05 
y =1/(+x?), y(0) =0 

6. x’ = In(2+ y), x(0) = —1 t=1,h=0.1 
y=xty, y(0) = —0.5 

7.x =y—x, x(0) = 1 t=1,h=0.05 
y =x-8y’, y(0) = 0.75 


3 In the exercises of section 6.2, equilibrium solutions were found and direction fields were plotted in 
exercises 1-7, which correspond to the same systems of differential equations given here. Similarly, in 
section 6.3, equilibrium solutions were classified through linearization in exercises 11-17, which also 
correspond to these systems. 
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8. Recall from section 6.1 that the nonlinear system of differential equations 
W' = —0.75W +0.25MW 
M' =0.5M —0.1MW 


models the numbers of wolves and moose (each measured in hundreds), 
in a predator-prey model where time is measured in years. Assume that at 
time t = 0 there are 250 moose and 550 wolves. Estimate the numbers of 
moose and wolves present at t = 3, 6, and 9 years using a step-size of (a) 
h=0.1, and (b) h= 0.01. Discuss your findings and describe the behavior 
of the trajectory.⁴


6.5 For further study 
6.5.1 The damped pendulum 


In our development of the pendulum equation, we learned that for a pendulum 
with an arm of length L and bob of mass m, the angle $\theta$ that the arm forms with
the positive x-axis at time t satisfies the IVP


$$L\theta'' = -g \sin\theta, \quad \theta(0) = \theta_0, \quad \theta'(0) = \theta_0' \qquad (6.5.1)$$


provided that we assume no friction is present in the screw from which the 
pendulum hangs and there is no air drag on the bob. Here, we investigate the 
effects of such resistance on the pendulum’s behavior. 


(a) Under the natural assumption that the friction or damping that is present
is directly proportional to the velocity of the bob along the arc of motion,
explain why it follows that the pendulum is governed by the IVP


$$L\theta'' = -g \sin\theta - c\theta', \quad \theta(0) = \theta_0, \quad \theta'(0) = \theta_0' \qquad (6.5.2)$$
where c is the damping constant.


(b) Using the standard change of variables, convert the nonlinear 
second-order IVP (6.5.2) to a nonlinear system of first-order IVPs. Write 
the system in the form x’ = F(x) for an appropriate function F. 


(c) Determine all equilibrium solutions of the system in (b). Are the equilibria 
different from those of the undamped pendulum? 


4 In the exercises of section 6.2, equilibrium solutions were found and the direction field was plotted
for this system in exercise 8.


(d) Let a given pendulum have an arm of length L = 1 m, and recall that
g = 9.8 m/sec^2. For each of the c-values c = 0.5, c = 1, c = 2, and c = 5,
plot the direction field for the system in (b) as well as trajectories that
correspond to the stated initial conditions below. For each plot, discuss the
behavior of the pendulum over time and how damping affects the
observed behavior.


(i) A(0 =e 


(ii) 6(0) = 4, 6’(0) = 
(iii) (0 5 Q’ (EtG 
(iv) A(0) = 2, 6’(0) = —10 


In addition, be sure to discuss the physical interpretation of each set of 
initial conditions and how these conditions affect the trajectories. 


(e) Using c = 1, find the linear approximation of the system in (b) at two 
different equilibrium points, one that is stable and another that is 
unstable. Discuss the graphical behavior of the two linear systems you find 
near the equilibrium points and how this compares to the plot of the 
corresponding direction field in (d). 


(f) Again using c = 1 and L = 1, apply Euler’s method with h = 0.01
to the system in (b) with the initial conditions $\theta(0) = 2$, $\theta'(0) = 10$.
Experiment with how many steps are needed in order to have the
approximations approach the stable equilibrium $(2\pi, 0)$, plot the
approximations you compute, and compare the results to the appropriate 
direction field in (d). 


6.5.2 Competitive species 


In our development of the predator—prey equations, we used the fundamental 
assumption that the prey population would, in the absence of a predator, grow 
according to an exponential model, and similarly that the predator would decay 
exponentially if no prey is available. These hypotheses led us to equations of the 
form 

$$x' = ax - cxy$$
$$y' = -by + dxy \qquad (6.5.3)$$

where x is the prey population and y represents the number of predators. Recall
that the terms $-cxy$ and $dxy$ represent a fraction of the number of predator-prey
interactions that are, respectively, harmful or beneficial to the two species.

In what follows, we consider a similar scenario where, instead of one
species preying on the other, two species are competing for resources. In this
setting, species interactions (modeled by “xy”) are harmful to both species. In
addition, rather than assuming exponential growth or decay for the individual
populations, we explore the effects of the assumption that each population on
its own grows logistically.


(a) Assume that in the absence of another species competing for resources, the
population x(t) grows according to the logistic model


$$x' = ax\left(1 - \frac{x}{A}\right)$$


where a and A are positive constants (a is the population’s growth
constant and A is its carrying capacity). Similarly, for a second population
y(t), assume that without another competing species present y(t) is
governed by the model


$$y' = by\left(1 - \frac{y}{B}\right)$$


where b and B are positive constants.


By viewing a fraction of the interactions xy as harmful, we can subtract
from each of the above differential equations a term proportional to xy
(say $\alpha xy$ from $x'$ and $\beta xy$ from $y'$) to account for this competition. Do so,
and show that the populations x(t) and y(t) satisfy the system of
equations given by


$$x' = ax\left(1 - \frac{1}{A}x - \frac{\alpha}{a}y\right)$$
$$y' = by\left(1 - \frac{1}{B}y - \frac{\beta}{b}x\right) \qquad (6.5.4)$$

(b) Throughout the remaining questions, we assume that x and y represent
populations measured in thousands. We explore the impact of different
constants in the equations, as well as various initial conditions. In (6.5.4),
let a = 0.5, b = 0.25, A = 5, B = 2, $\alpha = 0.04$, and $\beta = 0.02$. Find all
equilibrium points of the system. (Hint: there are more than two
equilibria.)


(c) At each of the equilibrium points determined in (b), compute the 
linearization of the system (6.5.4), and hence determine the stability of the 
equilibrium point. 


(d) In an appropriate window, plot the direction field for the system (6.5.4) 
and discuss how the direction field supports your conclusions regarding 
the stability of various equilibrium points in (c). Discuss the long-term 
behavior of the two populations for several different initial conditions. 


(e) With the initial conditions x(0) = 2, y(0) = 2, use Euler’s method for 
systems to estimate the values of the populations at a range of time values. 
Use a step size of h = 0.1 and compare your results to the plot in (d). 


(f) In (6.5.4), use the parameter values given in (b), except change the carrying 
capacity of the second population to B = 15. Respond to prompts (b), (c), 
(d), and (e) for this scenario and compare and contrast the updated system 
with the first one considered. In the new situation, which population will 
dominate in the long run? Why do you think this is the case? 


(g) In (6.5.4), let a = 0.5, b = 0.25, A = 5, and B = 2, but now adjust the
parameters $\alpha$ and $\beta$ to reflect greater competition for resources by setting
$\alpha = 0.4$ and $\beta = 0.2$. Respond to prompts (b), (c), (d), and (e) for this
scenario and compare and contrast the updated system with the first one 
considered. In the new situation, which population is more likely to 
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dominate in the long run? For which initial conditions is the weaker 
population able to survive? 


(h) Suppose there are three different species x, y, and z, all competing for 
resources. Under the assumption that population interactions xy and xz 
are harmful to x, and so on, what system of differential equations models 
the behavior of the three species? 


7 


Numerical methods for differential equations 


7.1 Motivating problems 


In previous chapters, we have learned to solve a wide range of differential 
equations. Primarily, our focus has been on linear differential equations: first- 
order linear equations, higher order linear equations with constant coefficients, 
and systems of linear equations with constant coefficients. Indeed, we have 
learned through a variety of techniques that under the proviso that a differential 
equation or system is linear, we can almost always find a solution. 

The situation is much more complicated for nonlinear equations. For 
example, while we can use an integrating factor to solve the linear first-order 
differential equation $y' + y = t$, if we replace $y$ by $y^2$, the differential equation


$$y' + y^2 = t \qquad (7.1.1)$$


is no longer linear. In addition, (7.1.1) is not separable, nor is it exact. 
With none of our established analytical methods available, it appears that we 
cannot solve this differential equation. If faced with the related initial-value 
problem 


$$y' + y^2 = t, \quad y(0) = 1 \qquad (7.1.2)$$


we know that we can visually approximate a solution by plotting the direction 
field that corresponds to the differential equation. Moreover, we learned in 
section 2.6 that we can generate a sequence of estimates of the values of the 
solution y(t) at discrete t-values separated by a step-size h according to the rule 


$$t_{n+1} = t_n + h \quad \text{and} \quad y_{n+1} = y_n + h f(t_n, y_n), \quad \text{for } n \ge 0 \qquad (7.1.3)$$
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The algorithm that generates this sequence of approximations is called Euler’s 
method. 

We encounter the same difficulties with higher order differential equations. 
While we can solve almost any higher order linear equation with constant 
coefficients, such as 


$$y'' + a_1 y' + a_0 y = f(t)$$


nonlinear equations are much more difficult. For instance, as discussed in 
section 6.1, a simple pendulum may be modeled by the nonlinear second-order 
initial-value problem 


$$\theta'' + \frac{g}{L} \sin\theta = 0, \quad \theta(0) = \theta_0, \quad \theta'(0) = \theta_0' \qquad (7.1.4)$$


where $\theta(t)$ is the angle the arm of the pendulum forms with a vertical axis at
time t. In chapter 6, we introduced several different approaches to approximate 
the solution to (7.1.4); each was based on converting the second-order equation 
to a system of first-order equations and approximating the solution to the 
resulting system. 

Finally, nonlinear systems of differential equations are important in their 
own right. A prominent example is the predator-prey equations, discussed in 
detail in section 6.1, where two populations M(t) and W(t) (in hundreds) 
are modeled by the following system of nonlinear first-order initial-value 
problems: 


$$W' = W(-0.75 + 0.25M), \quad W(0) = 3$$
$$M' = M(0.5 - 0.1W), \quad M(0) = 7 \qquad (7.1.5)$$


As with the pendulum, the nonlinearity of these equations makes determining an
analytical solution (i.e., formulas for W(t) and M(t)) impossible, and therefore
we must instead be content to find approximate solutions. In section 6.4, we
introduced an extension of Euler’s method that can be used to produce some
basic approximations to the solution of a system of nonlinear initial-value
problems such as (7.1.5).

But through a variety of examples considered in sections 2.6 and 6.4, we
have seen that Euler’s method has a big downside: each step produces significant
error, and each step compounds the error from the preceding step. To get
an accurate approximation using Euler’s method, a very small step-size h is
usually needed. With modern computing power so readily available, we might be
tempted to simply take very small h-values in this approach and be content to do
thousands of computations to get estimates of solutions. But taking smaller and
smaller values of h proves to be an unsatisfactory approach for many reasons,
perhaps most significantly because as numbers get extremely
small, computers have great difficulty distinguishing them from zero and major
round-off errors can result.

Instead, we will seek to develop approaches in the spirit of Euler’s method,
but more sophisticated in that they naturally reduce the error that comes from
using a step of $h = \Delta t$. Our goal is to develop numerical methods for initial-value
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problems (for first-order, higher order, and systems) that, given a step-size h, 
produce an accurate approximate solution to the initial-value problem. We 
desire that the methods give reasonably good approximations for small (but 
not too small) values of h, while at the same time not requiring too many 
calculations. In the upcoming sections, we will discuss problems of the nature 
of (7.1.2), (7.1.4), (7.1.5), and more, and develop and apply algorithms that 
produce acceptable approximations to solutions. 
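
The round-off concern raised in this section can be seen directly in floating-point arithmetic. The short Python sketch below is only an illustration of the phenomenon, not part of any method developed in this chapter; the variable names and the particular step values are chosen here for the demonstration. It shows that a step which is tiny relative to t is lost entirely when added to t, and that even a moderate step accumulates small errors when added repeatedly.

```python
t = 1.0
h_reasonable = 1e-3
h_tiny = 1e-17   # far below double-precision resolution near t = 1

step_seen = (t + h_reasonable != t)   # the moderate step changes t
step_lost = (t + h_tiny == t)         # the tiny step is absorbed by rounding

# Repeated addition of a modest step also accumulates round-off.
total = 0.0
for _ in range(10):
    total += 0.1
sum_is_exact = (total == 1.0)

print(step_seen, step_lost, sum_is_exact)   # prints: True True False
```

With a step such as `h_tiny`, an iteration of the form t_{n+1} = t_n + h would never advance past t = 1, no matter how many steps were taken.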


7.2 Beyond Euler's method 


To approach an initial-value problem that we cannot solve by standard 
techniques, such as separation of variables or integrating factors, we have learned 
that one option is to use Euler’s method. Given the IVP 


$$y' = f(t, y), \quad y(t_0) = y_0$$


this algorithm generates a sequence of points $(t_1, y_1), (t_2, y_2), \ldots, (t_n, y_n)$
according to the rule


$$y_{n+1} = y_n + h f(t_n, y_n) \quad \text{for } n \ge 0 \qquad (7.2.1)$$


where $t_{n+1} = t_n + h$. Each $y_n$ is an approximation to the value of the actual
solution y at the value $t_n$. That is, $y(t_n) \approx y_n$.

Euler’s method is developed by using the standard tangent line approximation
in calculus. While this is instructive and intuitive, the method is among the
least accurate of the many available methods. In this section, we begin to
develop algorithms beyond Euler’s method in an effort to increase the accuracy
of our approximations while actually decreasing the number of computations
we execute.

Before we develop new approaches, we first revisit some important concepts 
from numerical integration in calculus. These ideas not only remind us of key 
issues in approximation techniques, but also inform our efforts to approximate 
solutions to initial-value problems. Given a continuous function f(t) on an
interval $[t_0, t_0 + h]$, there are several basic approximations to $\int_{t_0}^{t_0+h} f(t)\,dt$.
Specifically,


$$\int_{t_0}^{t_0+h} f(t)\,dt \approx h \cdot f(t_0) \qquad \text{(left endpoint rule)}$$
$$\int_{t_0}^{t_0+h} f(t)\,dt \approx h \cdot f(t_0 + h) \qquad \text{(right endpoint rule)}$$
$$\int_{t_0}^{t_0+h} f(t)\,dt \approx h \cdot \frac{f(t_0) + f(t_0 + h)}{2} \qquad \text{(trapezoid rule)}$$
$$\int_{t_0}^{t_0+h} f(t)\,dt \approx h \cdot f\left(t_0 + \frac{h}{2}\right) \qquad \text{(midpoint rule)}$$


It is a standard exercise in calculus to show that the left and right endpoint rules
are the least accurate approximations of the four, while the midpoint rule is the
best. While one can make sophisticated arguments using Taylor series to justify
claims about the size of the error in such an approximation, visual arguments are
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just as convincing: sampling f at the midpoint of the interval usually balances 
the behavior of the function and leads to the best approximation of the integral 
of the four options above. 
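
A quick numerical experiment supports the same ranking. The Python sketch below applies the four single-interval rules to one sample integrand; the choice of $f(t) = e^t$ on $[0, 0.5]$ and the helper name `one_step_rules` are assumptions made here for illustration, and a different integrand could order the two endpoint rules differently.

```python
import math


def one_step_rules(f, t0, h):
    """Return the four single-interval estimates of the integral of f on [t0, t0 + h]."""
    return {
        "left": h * f(t0),
        "right": h * f(t0 + h),
        "trapezoid": h * (f(t0) + f(t0 + h)) / 2,
        "midpoint": h * f(t0 + h / 2),
    }


t0, h = 0.0, 0.5
exact = math.exp(t0 + h) - math.exp(t0)   # exact integral of e^t on [0, 0.5]
estimates = one_step_rules(math.exp, t0, h)
errors = {name: abs(value - exact) for name, value in estimates.items()}
for name in estimates:
    print(f"{name:10s} estimate = {estimates[name]:.6f}  error = {errors[name]:.6f}")
```

For this integrand the midpoint error is the smallest and the trapezoid error is next, with both endpoint rules far behind.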

There is a direct link between the numerical approximation of definite 
integrals and numerical methods to estimate solutions to initial-value problems 
such as Euler’s method. Given the IVP 


$$y'(t) = f(t, y), \quad y(t_0) = y_0$$


if we integrate both sides of the differential equation with respect to t from $t = t_0$
to $t = t_0 + h$ for some h > 0, then


$$\int_{t_0}^{t_0+h} y'(t)\,dt = \int_{t_0}^{t_0+h} f(t, y(t))\,dt \qquad (7.2.2)$$


Evaluating the integral on the left side of (7.2.2) with the fundamental theorem of calculus, we have


$$y(t_0 + h) - y(t_0) = \int_{t_0}^{t_0+h} f(t, y(t))\,dt$$


or equivalently


$$y(t_0 + h) = y(t_0) + \int_{t_0}^{t_0+h} f(t, y(t))\,dt \qquad (7.2.3)$$
Estimating the integral in (7.2.3) with the left endpoint rule,
$$y(t_0 + h) \approx y(t_0) + h f(t_0, y(t_0)) \qquad (7.2.4)$$
Using the initial condition $y(t_0) = y_0$, it follows that
$$y(t_0 + h) \approx y_0 + h f(t_0, y_0) \qquad (7.2.5)$$


which is precisely the first step in Euler’s method. That is, we have shown
in our efforts to step from $t = t_0$ to $t = t_0 + h$ along the solution y(t) that this
process can be equivalently achieved by estimating the value of a definite integral.
Moreover, Euler’s method can be viewed as arising naturally from estimating 
the required definite integral through a left endpoint rule. 

As such, it is not surprising that Euler’s method is not an accurate approach, 
for neither is the left endpoint rule for approximating integrals. The availability 
of the trapezoid and midpoint rules as better approximations leads us to consider 
two improvements upon Euler’s method. 


7.2.1 Heun’s method 


To improve on Euler’s method, we return to (7.2.3), and instead estimate
the definite integral on the right-hand side with the trapezoid rule. Doing so,
we find


$$y(t_0 + h) \approx y(t_0) + h \cdot \frac{f(t_0, y(t_0)) + f(t_0 + h, y(t_0 + h))}{2} \qquad (7.2.6)$$
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The difficulty in (7.2.6) is that the last term in the approximation on the right- 
hand side involves $y(t_0 + h)$, the very quantity we are trying to estimate. One way
to view what is occurring in this approach is that we are trying to use not only the
slope at $(t_0, y_0)$, computed as $f(t_0, y_0)$, but also the slope at $(t_0 + h, y(t_0 + h))$.
While we do not know $y(t_0 + h)$ exactly, we can estimate this value using Euler’s
method. In particular, if we use the fact that $y(t_0) = y_0$ and employ the Euler
approximation $y(t_0 + h) \approx y_0 + h f(t_0, y_0)$, then from (7.2.6) we find that


$$y(t_0 + h) \approx y_0 + h \cdot \frac{f(t_0, y_0) + f(t_0 + h, y_0 + h f(t_0, y_0))}{2} \qquad (7.2.7)$$


Generalizing (7.2.7) to the situation where we are moving from the known
approximation $y(t_n) \approx y_n$ at the point $(t_n, y_n)$ to a new approximation $(t_{n+1}, y_{n+1})$
with $t_{n+1} = t_n + h$, we have developed Heun’s method given by


$$y_{n+1} = y_n + h \cdot \frac{f(t_n, y_n) + f(t_{n+1}, y_n + h f(t_n, y_n))}{2} \qquad (7.2.8)$$


Because this algorithm is more complicated than Euler’s method, some 
additional notation can assist us in its implementation. We first let 


$$a_n = f(t_n, y_n) \qquad (7.2.9)$$


which is the slope of the solution curve at $(t_n, y_n)$ given by the IVP. We observe
that the expression $a_n$ arises twice in (7.2.8), and that we also have to compute
$f(t_{n+1}, y_n + h a_n)$. We therefore let


$$b_n = f(t_{n+1}, y_n + h a_n) \qquad (7.2.10)$$


It follows that Heun’s method is then executed by computing


$$y_{n+1} = y_n + h \cdot \frac{a_n + b_n}{2} \qquad (7.2.11)$$


In this light, we see that Heun’s method uses the average of two slopes (the
slope at $(t_n, y_n)$ and the approximate slope at $(t_{n+1}, y_{n+1})$) in order to predict
the next value of the solution y(t). We consider an example to demonstrate 
how Heun’s method is implemented and to contrast its results with those from 
Euler’s method. 
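
Before working by hand, it may help to see the three quantities $a_n$, $b_n$, and $y_{n+1}$ as three lines of code. The following Python sketch is one possible rendering of (7.2.9) through (7.2.11); the function name `heun` and its signature are illustrative choices. It is run on the IVP $y' = 2t(2 - y)$, $y(0) = 1$ treated in example 7.2.1, whose exact solution $y(t) = 2 - e^{-t^2}$ is used only to measure the error.

```python
import math


def heun(f, t0, y0, h, steps):
    """Heun's method for y' = f(t, y), y(t0) = y0; returns a list of (t_n, y_n)."""
    t, y = t0, y0
    values = [(t, y)]
    for _ in range(steps):
        a = f(t, y)                  # slope at (t_n, y_n)
        b = f(t + h, y + h * a)      # slope at the Euler prediction for t_{n+1}
        y = y + h * (a + b) / 2      # step along the average of the two slopes
        t = t + h
        values.append((t, y))
    return values


values = heun(lambda t, y: 2 * t * (2 - y), t0=0.0, y0=1.0, h=0.1, steps=10)
t_end, y_end = values[-1]
exact_end = 2 - math.exp(-t_end**2)
print(values[1][1], values[2][1])
print(y_end, abs(exact_end - y_end))
```

Each step costs two evaluations of f instead of one, which is the price paid for the improved accuracy.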


Example 7.2.1 Execute ten steps of Heun’s method with h = 0.1 to find an 
approximate solution of the initial-value problem 

$$y' = 2t(2 - y), \quad y(0) = 1$$
Compare the results to Euler’s method as well as the exact solution of the IVP.


Solution. Note first that the given differential equation is both linear and
separable. The exact solution of the IVP is $y(t) = 2 - e^{-t^2}$.

To apply Heun’s method, we must compute $a_n$, $b_n$, and $y_{n+1}$ at each step.
To begin, $a_0 = f(t_0, y_0)$. From the stated IVP, $f(t, y) = 2t(2 - y)$ and
$(t_0, y_0) = (0, 1)$. Thus,


$$a_0 = 2 \cdot 0 \cdot (2 - 1) = 0$$
In addition, $b_0 = f(t_1, y_0 + h a_0)$, so
$$b_0 = 2 \cdot 0.1 \cdot (2 - (1 + 0.1 \cdot 0)) = 0.2$$


With both $a_0$ and $b_0$ calculated, we can now determine $y_1$ to be


$$y_1 = y_0 + \frac{h}{2}(a_0 + b_0) = 1 + \frac{0.1}{2}(0 + 0.2) = 1.01$$
Repeating these same steps to determine $y_2$, we find that
$$a_1 = f(t_1, y_1) = f(0.1, 1.01) = 2 \cdot 0.1 \cdot (2 - 1.01) = 0.198$$
and
$$b_1 = f(t_2, y_1 + h a_1) = f(0.2, 1.01 + 0.1 \cdot 0.198) = 2 \cdot 0.2 \cdot (2 - 1.0298) = 0.38808$$
so that


$$y_2 = y_1 + \frac{0.1}{2}(a_1 + b_1) = 1.01 + 0.05(0.198 + 0.38808) = 1.039304$$


Implementing the remaining computations in a program such as Excel, it follows 
that we can generate the values shown in table 7.1. Included in the table are the 
approximations generated by Euler’s method, as well as the errors resulting from 
both methods which are computed by comparison to the exact solution of the 
IVP. For simplicity, we report the results from every other step in each algorithm. 


Table 7.1
Euler’s method and Heun’s method applied to the IVP $y' = 2t(2 - y)$, $y(0) = 1$,
using $h = 0.1$

$t_n$ | Euler $y_n$  | Heun $y_n$   | Solution $y(t_n)$ | Euler error $\lvert y(t_n) - y_n \rvert$ | Heun error $\lvert y(t_n) - y_n \rvert$
0     | 1            | 1            | 1                 | 0           | 0
0.2   | 1.02         | 1.039304     | 1.039210561       | 0.019989439 | 0.000093439
0.4   | 1.115648     | 1.147959794  | 1.147856211       | 0.038539949 | 0.000103583
0.6   | 1.267756544  | 1.302226785  | 1.302323674       | 0.053302085 | 0.000096889
0.8   | 1.445838152  | 1.472149858  | 1.472707576       | 0.061796472 | 0.000557718
1     | 1.618293319  | 1.630946606  | 1.632120559       | 0.062514097 | 0.001173953

Obviously, Heun’s method is a major improvement over Euler’s method. In fact,
given that we use the Euler approximation at each step to help forecast the next
slope encountered, it is somewhat remarkable how accurate Heun’s method
is. It can be shown rigorously that the error in Heun’s method is a significant
improvement over Euler’s method by relating the error in the approximation
to the step-size $h$; it turns out¹ that the error in Euler’s method is proportional
to $h^2$, while the error in Heun’s method is proportional to $h^3$. Finally, we might
observe that it appears unusual that the error in Heun’s method actually drops
from $t_4 = 0.4$ to $t_6 = 0.6$, and that the growth in the error slows in Euler’s
method at the same stage. This is due to the fact that the solution function
$y(t) = 2 - e^{-t^2}$ is an increasing function whose concavity changes (from concave
up to concave down) at the point $t = 1/\sqrt{2}$; the change in concavity allows
the linear approximations to temporarily catch up, instead of having the error
continue to increase at an increasing rate.

We have seen that Heun’s method is developed using an application of the 
trapezoid rule in numerical integration. We consider another similar method 
(based on the midpoint rule) before introducing more sophisticated techniques 
in section 7.3. 


7.2.2 Modified Euler’s method 


The midpoint rule is normally more accurate than the trapezoid rule.² Given
our experience with Heun’s method and its connection to the trapezoid rule, it 
makes sense to see if we can develop a related method that uses the perspective 
of the midpoint rule. 

Recalling (7.2.3),

$$y(t_0 + h) = y(t_0) + \int_{t_0}^{t_0 + h} f(t, y(t))\,dt$$

if we use the midpoint rule to estimate the integral, then we have to evaluate the
integrand at the midpoint $t_0 + h/2$ of the interval $[t_0, t_0 + h]$. Doing so,

$$y(t_0 + h) \approx y(t_0) + hf\left(t_0 + \frac{h}{2},\, y\left(t_0 + \frac{h}{2}\right)\right) \tag{7.2.12}$$


As with Heun’s method, in the context of trying to solve the IVP $y' = f(t, y)$,
$y(t_0) = y_0$, only $y(t_0)$ is known. Thus, we do not know, and therefore have to
estimate, the value of $y(t_0 + h/2)$ in (7.2.12). We again employ Euler’s method
and write

$$y\left(t_0 + \frac{h}{2}\right) \approx y(t_0) + \frac{h}{2} f(t_0, y(t_0)) \tag{7.2.13}$$

¹ A more formal analysis of errors that shows the dependence on powers of $h$ is discussed in
section 7.3.

² On an interval where $f(x)$ has consistent concavity, the midpoint rule is approximately twice as
accurate as the trapezoid rule.

Substituting (7.2.13) in (7.2.12) and replacing $y(t_0)$ with $y_0$,

$$y(t_0 + h) \approx y_0 + hf\left(t_0 + \frac{h}{2},\, y_0 + \frac{h}{2} f(t_0, y_0)\right) \tag{7.2.14}$$


Generalizing (7.2.14) to the situation where we are moving from a known
approximation $y(t_n) \approx y_n$ at point $(t_n, y_n)$ to the next approximation at
$(t_{n+1}, y_{n+1})$, we have developed the Modified Euler method given by

$$y_{n+1} = y_n + hf\left(t_n + \frac{h}{2},\, y_n + \frac{h}{2} f(t_n, y_n)\right) \tag{7.2.15}$$


As with Heun’s method, some additional notation assists us in tracking our
computations. Let $a_n = f(t_n, y_n)$ and

$$c_n = y_n + \frac{h}{2} a_n$$

so that

$$y_{n+1} = y_n + hf\left(t_n + \frac{h}{2},\, c_n\right) \tag{7.2.16}$$


We consider an example in order to see the implementation of the Modified 
Euler method and to compare its results to those of Heun’s method. We again 
employ an IVP that we can solve exactly in order to compare the errors of the 
two methods. 
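Before working the example by hand, it may help to see (7.2.16) as a short program. The sketch below is ours rather than the text’s; the name `modified_euler` is an arbitrary choice, and the IVP is the one posed in example 7.2.2.

```python
import math

def modified_euler(f, t0, y0, h, steps):
    """Return the list [y_0, y_1, ..., y_steps] from the Modified Euler method."""
    t, y = t0, y0
    ys = [y]
    for n in range(steps):
        a = f(t, y)                  # a_n: slope at the current point
        c = y + (h / 2) * a          # c_n: Euler estimate of y at the midpoint
        y = y + h * f(t + h / 2, c)  # full step using the midpoint slope, (7.2.16)
        t = t0 + (n + 1) * h
        ys.append(y)
    return ys

# IVP of example 7.2.2: y' = e^(2t) - y, y(0) = 1
f = lambda t, y: math.exp(2 * t) - y
ys = modified_euler(f, 0.0, 1.0, 0.1, 10)
print(ys[1], ys[2], ys[10])
```

Like Heun’s method, this scheme evaluates $f$ twice per step; the difference lies only in where the second slope is sampled.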


Example 7.2.2 Consider the initial-value problem $y' = e^{2t} - y$, $y(0) = 1$. Apply
the Modified Euler method to estimate the value of $y(1)$ using $h = 0.1$ and
compare the results with Heun’s method and the exact solution.


Solution. Since $y' = e^{2t} - y$ is a linear first-order differential equation, we can
find the general solution $y(t) = Ce^{-t} + \frac{1}{3}e^{2t}$, and hence the exact solution to
the IVP is

$$y(t) = \frac{2}{3}e^{-t} + \frac{1}{3}e^{2t}$$

To begin the Modified Euler method, we know from the given IVP that
$f(t, y) = e^{2t} - y$ and that $(t_0, y_0) = (0, 1)$. Thus, $a_0 = f(t_0, y_0) = e^{2 \cdot 0} - 1 = 0$. Next,
we observe that $c_0 = y_0 + \frac{h}{2}a_0 = 1 + 0.05 \cdot 0 = 1$. To compute $y_1$, by (7.2.16)
we have

$$y_1 = y_0 + hf\left(t_0 + \frac{h}{2},\, c_0\right) = 1 + 0.1 \cdot (\exp(2(0 + 0.05)) - 1) = 1 + 0.1 \cdot 0.105170918 = 1.010517092$$

Continuing to the next step, $a_1 = f(t_1, y_1) = \exp(2 \cdot 0.1) - 1.010517092 =
0.210885666$. Next,

$$c_1 = y_1 + \frac{h}{2}a_1 = 1.010517092 + 0.05 \cdot 0.210885666 = 1.021061375$$


Table 7.2
Heun’s method and Modified Euler’s method (ME) applied to the IVP $y' = e^{2t} - y$,
$y(0) = 1$ with $h = 0.1$

$t_n$ | Heun $y_n$   | ME $y_n$     | Solution $y(t_n)$ | Heun error $\lvert y(t_n) - y_n \rvert$ | ME error $\lvert y(t_n) - y_n \rvert$
0     | 1            | 1            | 1                 | 0           | 0
0.2   | 1.044572834  | 1.043396835  | 1.043095401       | 0.001477433 | 0.000301434
0.4   | 1.192009094  | 1.189291538  | 1.188727007       | 0.003282087 | 0.000564531
0.6   | 1.478251184  | 1.473408204  | 1.472580065       | 0.005671119 | 0.000828139
0.8   | 1.959569856  | 1.951698881  | 1.950563451       | 0.009006405 | 0.00113543
1     | 2.722082435  | 2.70981115   | 2.70827166        | 0.013810775 | 0.001539489

Finally,

$$y_2 = y_1 + hf\left(t_1 + \frac{h}{2},\, c_1\right) = 1.010517092 + 0.1 \cdot (\exp(2(0.1 + 0.05)) - 1.021061375)$$

$$= 1.010517092 + 0.1 \cdot 0.328797432 = 1.043396835$$


Executing eight more steps using a computer, we find the results in table 7.2. 
We also show the results from Heun’s method in order to make a comparison 
between the two approaches we have developed beyond Euler’s method, again 
reporting the results from every other step. 


From the table, we see that the Modified Euler method is an improvement 
over Heun’s method. This is not too surprising since the former stems from 
the midpoint rule for integration, while the latter from the trapezoid rule. In 
addition, if we plot the exact solution function, we see that the solution is always 
increasing and concave up over the interval of interest; in the presence of such 
consistent concavity in the solution function, the midpoint rule will generate 
noticeably more accurate approximations than will the trapezoid rule. 


Obviously Heun’s method and the Modified Euler method are substantial 
improvements over the standard Euler’s method. Not only are their errors much 
smaller, but the errors grow less quickly. To better understand why this is so, 
observe that Euler’s method relies solely on presently available data in generating 
its estimates. That is, the method takes an approach that relies on just one data 
point in order to proceed to the next approximation. Our two newest methods 
instead look into the future: rather than using the current point and the slope 
at that location, they use the current point and an estimate of the slope at a 
point that is ahead of our current location. We create these estimates using only
the currently available data, but the approaches lead to a substantial increase
in accuracy that makes us hopeful for significant improvements through other 
predictive approximation techniques that we are yet to investigate. 


Exercises 7.2 In exercises 1-10, use (a) Euler’s method, (b) Heun’s method, 
and (c) the Modified Euler method to estimate y(1) using h = 0.1, and compare 
the approximations generated by the three methods. In exercises 1-6, compare 
the approximations with the exact solution. 


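1. $y' + 2ty = 0$, $y(0) = -2$
2. $y' = 2y - 1$, $y(0) = 2$
3. $y' - y = 0$, $y(0) = 2$
4. $(y')^2 - 2y = 0$, $y(0) = 2$
5. $y' - y = 1$, $y(0) = 0$
6. $tyy' = -1 - y^2$, $y(0) = 2$
7. $y' + ty = t^2$, $y(0) = 1$
8. $y' + y = t$, $y(0) = 1$
9. $y' + \sin y = 2e^{t}$, $y(0) = 0$
10. $y' = 2e^{t/2} \sin\sqrt{y}$, $y(0) = 0$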


7.3 Higher order methods 


In calculus, we learn that if $F(x)$ is a function with $n + 1$ derivatives in an interval
surrounding a value $x = a$, then $F$ has a Taylor polynomial expansion that obeys
the relationship

$$F(x) = F(a) + F'(a)(x - a) + \cdots + \frac{F^{(n)}(a)}{n!}(x - a)^n + \frac{F^{(n+1)}(\xi_x)}{(n+1)!}(x - a)^{n+1} \tag{7.3.1}$$


which is valid for $x$-values in an interval surrounding $a$, where $\xi_x$ is a number within
that interval that depends on $x$. If we think of our interest in the solution $y(t)$ of
an initial-value problem, assuming that $y$ is sufficiently differentiable, the Taylor
series expansion of $y$ provides insight into errors that arise in approximation
schemes. In (7.3.1), if we replace $F$ by $y$, $a$ by $t_0$, and $x$ by $t_0 + h$, noting that
$x - a = h$, it follows that

$$y(t_0 + h) = y(t_0) + hy'(t_0) + \frac{h^2}{2!}y''(t_0) + \cdots + \frac{h^n}{n!}y^{(n)}(t_0) + O(h^{n+1}) \tag{7.3.2}$$

where by “$O(h^{n+1})$” we mean “of order $h^{n+1}$” or “proportional to $h^{n+1}$.”
From (7.3.2), we can discern the so-called truncation error of certain
methods. For example, if we use the approximation

$$y(t_0 + h) \approx y(t_0) + hy'(t_0) \tag{7.3.3}$$

which corresponds to Euler’s method,³ we see that the truncation error is
proportional to $h^2$ from the equation $y(t_0 + h) = y(t_0) + hy'(t_0) + O(h^2)$. We
therefore say that Euler’s method is first-order, in reference to the highest power
of $h$ present in (7.3.3).

Since we use a small step-size h, it is evident that higher order methods 
are superior: in the error due to truncation, higher powers of h will approach 
zero faster. In what follows, we will investigate second-, third-, and fourth- 
order approaches. The first two arise through using the Taylor series expansion 
directly, and are therefore called Taylor methods. 
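The claim that a single Euler step has truncation error proportional to $h^2$ can be checked numerically: halving $h$ should cut the one-step error by a factor of roughly four. The sketch below is an illustration of ours, not taken from the text; it assumes the IVP $y' = e^{2t} - y$, $y(0) = 1$ of example 7.2.2 and its exact solution, and the helper names are hypothetical.

```python
import math

def exact(t):
    # Exact solution of y' = e^(2t) - y, y(0) = 1 (from example 7.2.2)
    return (2 / 3) * math.exp(-t) + (1 / 3) * math.exp(2 * t)

def f(t, y):
    return math.exp(2 * t) - y

def one_step_euler_error(h):
    """Error after a single Euler step of size h starting from (0, 1)."""
    y1 = 1.0 + h * f(0.0, 1.0)
    return abs(exact(h) - y1)

# If the one-step error behaves like C*h^2, this ratio should be roughly 4
ratio = one_step_euler_error(0.1) / one_step_euler_error(0.05)
print(ratio)
```

The ratio is not exactly 4 because the neglected $h^3$ and higher terms still contribute at these step sizes.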


7.3.1 Taylor methods 


To employ a second-order Taylor method, from (7.3.2) we must be able to
compute

$$y(t_0 + h) \approx y(t_0) + hy'(t_0) + \frac{h^2}{2}y''(t_0) \tag{7.3.4}$$


In a standard initial-value problem, we are given $y' = f(t, y)$ (plus an initial
condition), so we can compute $y''$ from the form of the differential equation. In
particular, since

$$y'(t) = f(t, y(t))$$

the chain rule for functions of two variables⁴ implies that

$$y'' = \frac{d}{dt}[f(t, y)] = f_t(t, y)\frac{d}{dt}[t] + f_y(t, y)\frac{d}{dt}[y] = f_t(t, y) + f_y(t, y)\,y' = f_t(t, y) + f_y(t, y) f(t, y) \tag{7.3.5}$$
Combining (7.3.5) with (7.3.4), we have developed the second-order Taylor
method given by

$$y(t_0 + h) \approx y(t_0) + hf(t_0, y_0) + \frac{h^2}{2}\left[f_t(t_0, y_0) + f_y(t_0, y_0) f(t_0, y_0)\right] \tag{7.3.6}$$

Generalizing (7.3.6) to the step from $y_n$ to $y_{n+1}$, we find that

$$y_{n+1} = y_n + hf(t_n, y_n) + \frac{h^2}{2}\left[f_t(t_n, y_n) + f_y(t_n, y_n) f(t_n, y_n)\right] \tag{7.3.7}$$

where $y_n \approx y(t_n)$. We consider an example to demonstrate the implementation
of this method and compare it to results previously considered.

³ Observe that we are writing $y'(t_0)$, which is given by $f(t_0, y_0)$ in Euler’s method.
⁴ We are using the rule that if $f(x, y)$ is a differentiable function of $x$ and $y$, and $x$ and $y$ are each
differentiable functions of $t$, then $d/dt[f(x, y)] = f_x(x, y)\,dx/dt + f_y(x, y)\,dy/dt$.


Example 7.3.1 Execute ten steps of the second-order Taylor series method
with $h = 0.1$ to find an approximate solution of the initial-value problem

$$y' = e^{2t} - y, \quad y(0) = 1$$

Compare the results to those of Heun’s method and to the exact solution.


Solution. This is the same IVP that we considered in example 7.2.2 with
Heun’s method and the Modified Euler method. To employ (7.3.7), we first must
compute $f_t(t, y)$ and $f_y(t, y)$. Since $f(t, y) = e^{2t} - y$, we know that $f_t(t, y) = 2e^{2t}$
and $f_y(t, y) = -1$. In addition, to simplify the implementation of the method,
we use notation similar to Heun’s method. We let $a_n = f(t_n, y_n)$, $r_n = f_t(t_n, y_n)$,
and $s_n = f_y(t_n, y_n)$, so that

$$y_{n+1} = y_n + ha_n + \frac{h^2}{2}[r_n + s_n a_n]$$

Beginning with $t_0 = 0$ and $y_0 = 1$, observe that

$$a_0 = f(0, 1) = e^{2 \cdot 0} - 1 = 0$$
$$r_0 = f_t(0, 1) = 2e^{2 \cdot 0} = 2$$
$$s_0 = f_y(0, 1) = -1$$

We then have

$$y_1 = y_0 + ha_0 + \frac{h^2}{2}[r_0 + s_0 a_0] = 1 + 0.1 \cdot 0 + \frac{0.1^2}{2}[2 - 1 \cdot 0] = 1.01$$

Similarly, we can compute

$$a_1 = f(0.1, 1.01) = e^{0.2} - 1.01 = 0.211402758$$
$$r_1 = f_t(0.1, 1.01) = 2e^{0.2} = 2.442805516$$
$$s_1 = f_y(0.1, 1.01) = -1$$

and thus

$$y_2 = y_1 + ha_1 + \frac{h^2}{2}[r_1 + s_1 a_1] = 1.01 + 0.1 \cdot 0.211402758 + \frac{0.1^2}{2}[2.442805516 - 1 \cdot 0.211402758] = 1.04229729$$


Continuing these computations through ten steps, we find the results noted in
table 7.3, which are listed for every other step. Note, too, that we have included
the results of Heun’s method from its application to the same IVP with the same
step-size $h = 0.1$.

Table 7.3
Taylor’s method and Heun’s method applied to the IVP $y' = e^{2t} - y$, $y(0) = 1$
using $h = 0.1$

$t_n$ | Taylor $y_n$ | Heun $y_n$   | Solution $y(t_n)$ | Taylor error $\lvert y(t_n) - y_n \rvert$ | Heun error $\lvert y(t_n) - y_n \rvert$
0     | 1            | 1            | 1                 | 0           | 0
0.2   | 1.04229729   | 1.044572834  | 1.043095401       | 0.000798112 | 0.001477433
0.4   | 1.186750654  | 1.192009094  | 1.188727007       | 0.001976353 | 0.003282087
0.6   | 1.468880073  | 1.478251184  | 1.472580065       | 0.003699992 | 0.005671119
0.8   | 1.944339609  | 1.959569856  | 1.950563451       | 0.006223842 | 0.009006405
1     | 2.698337638  | 2.722082435  | 2.70827166        | 0.009934023 | 0.013810775
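The computations of example 7.3.1 can be sketched in code as follows. This is our own illustrative implementation of (7.3.7), not the authors’; the function name `taylor2` is hypothetical, and the caller is assumed to supply the partial derivatives $f_t$ and $f_y$ by hand, exactly as in the worked solution.

```python
import math

def taylor2(f, f_t, f_y, t0, y0, h, steps):
    """Return [y_0, ..., y_steps] from the second-order Taylor method (7.3.7)."""
    t, y = t0, y0
    ys = [y]
    for n in range(steps):
        a = f(t, y)      # a_n = f(t_n, y_n)
        r = f_t(t, y)    # r_n = f_t(t_n, y_n)
        s = f_y(t, y)    # s_n = f_y(t_n, y_n)
        y = y + h * a + (h ** 2 / 2) * (r + s * a)
        t = t0 + (n + 1) * h
        ys.append(y)
    return ys

# IVP of example 7.3.1: y' = e^(2t) - y, y(0) = 1
f = lambda t, y: math.exp(2 * t) - y
f_t = lambda t, y: 2 * math.exp(2 * t)   # partial derivative of f in t
f_y = lambda t, y: -1.0                  # partial derivative of f in y

ys = taylor2(f, f_t, f_y, 0.0, 1.0, 0.1, 10)
print(ys[1], ys[2], ys[10])
```

The need to pass `f_t` and `f_y` explicitly is the practical drawback of Taylor methods: every new differential equation requires fresh derivative calculations before the method can be run.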


From table 7.3, we can see that the errors in Heun’s method and the second- 
order Taylor method are roughly proportionate and seem to grow at the same 
rate. This suggests that Heun’s method may also be a second-order method—an 
assertion that may be proved by studying related higher order methods. In 
particular, Heun’s method can be viewed as one of a collection of algorithms 
known as Runge-Kutta methods, which we will consider after some additional 
work with Taylor methods. 

Having shown that we can use the Taylor series (7.3.2) to motivate the 
development of the second-order method (7.3.7), it is natural to wonder if we 
could extend this work further to a third-order method. This is desirable since
if the error in our method is proportionate to $h^4$, then the method will be more
accurate without having to use smaller values of $h$.

It is indeed possible to develop a third-order method, provided that the
function $f(t, y)$ from the given IVP is sufficiently differentiable. In particular,
in order to write

$$y(t_0 + h) \approx y(t_0) + hy'(t_0) + \frac{h^2}{2!}y''(t_0) + \frac{h^3}{3!}y'''(t_0) \tag{7.3.8}$$


we must compute the third derivative of $y$. From our earlier work (7.3.5), we
know that

$$y'' = f_t(t, y) + f_y(t, y)\,y'(t) \tag{7.3.9}$$

Applying the chain rule to the first term in (7.3.9), along with the fact that
$y' = f(t, y)$,

$$\frac{d}{dt}[f_t(t, y)] = f_{tt}(t, y)\frac{d}{dt}[t] + f_{ty}(t, y)\frac{d}{dt}[y] = f_{tt}(t, y) + f_{ty}(t, y) f(t, y) \tag{7.3.10}$$

where the final step follows from using $y' = f(t, y)$. Using both the product rule
and the chain rule on the second term in (7.3.9) and suppressing the “$(t, y)$”
argument of each function present,

$$\frac{d}{dt}[f_y\,y'] = f_y \frac{d}{dt}[y'] + y' \frac{d}{dt}[f_y] = f_y[f_t + f_y f] + f[f_{yt} + f_{yy} f] = f_y f_t + f_y^2 f + f_{yt} f + f_{yy} f^2 \tag{7.3.11}$$

Combining (7.3.10) and (7.3.11) and using the fact that $f_{ty} = f_{yt}$, we have shown
that

$$y''' = f_{tt} + f_{ty} f + f_y f_t + f_y^2 f + f_{yt} f + f_{yy} f^2 = f_{tt} + 2f_{ty} f + f_t f_y + f_y^2 f + f_{yy} f^2 \tag{7.3.12}$$


From (7.3.12), we understand why we normally do not use third-order Taylor
methods in practice: the computations are extremely cumbersome. Were we to
attempt to write

$$y(t_0 + h) \approx y(t_0) + hy'(t_0) + \frac{h^2}{2!}y''(t_0) + \frac{h^3}{3!}y'''(t_0)$$

in terms of the function $f$ from the given IVP, we would have to compute

$$y(t_0 + h) \approx y_0 + hf + \frac{h^2}{2}(f_t + f_y f) + \frac{h^3}{6}\left(f_{tt} + 2f_{ty} f + f_t f_y + f_y^2 f + f_{yy} f^2\right)$$

where each appearance of the function $f$ or one of its partial derivatives is also
being evaluated at the point $(t_0, y_0)$. This combination of the determination
of a large number of functions and the evaluation of each at every stage of an
algorithm makes Taylor methods of orders higher than two unreasonable to
use. Hence, we next introduce one of the most popular and effective families of
numerical methods for the solution of IVPs, known as Runge-Kutta methods, which
enable us to achieve higher order approximations without the difficulty of computing
multiple partial derivatives and evaluating these functions repeatedly.
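As a small sanity check on (7.3.12), consider the function $f(t, y) = e^{2t} - y$ from example 7.3.1. The choice of this particular $f$ is ours, made only for illustration. Its partial derivatives are simple enough that the general formula can be compared against direct differentiation of the differential equation:

```latex
\begin{align*}
f_t &= 2e^{2t}, \quad f_y = -1, \quad f_{tt} = 4e^{2t}, \quad f_{ty} = 0, \quad f_{yy} = 0, \\
y''' &= f_{tt} + 2f_{ty}f + f_t f_y + f_y^2 f + f_{yy} f^2 \\
     &= 4e^{2t} + 0 - 2e^{2t} + (e^{2t} - y) + 0 \\
     &= 3e^{2t} - y.
\end{align*}
```

Differentiating directly gives the same result: $y'' = 2e^{2t} - y' = e^{2t} + y$, and hence $y''' = 2e^{2t} + y' = 3e^{2t} - y$. Even in this friendly case five partial derivatives had to be found, which suggests how quickly the bookkeeping grows for less convenient choices of $f$.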


7.3.2 Runge-Kutta methods 


Where higher order Taylor methods require finding partial derivatives of 
y’ = f(t, y) and evaluating these derivatives at each stage of the algorithm, 
Runge-Kutta methods seek to avoid using partial derivatives altogether, while
still achieving the desired higher order accuracy. Instead, in Runge-Kutta
methods the function f is evaluated at a greater number of points, essentially 
seeking to compute the slope at the current and future points in an effort to 
make as accurate a prediction as possible. 

Formally, Runge-Kutta methods can be viewed as a generalization of Heun’s
method. Recall that in Heun’s method we write

$$y_{n+1} = y_n + \frac{h}{2}(a_n + b_n)$$

where

$$a_n = f(t_n, y_n) \quad \text{and} \quad b_n = f(t_{n+1}, y_n + ha_n)$$

Rather than prescribing that we compute or estimate slopes at the points
$(t_n, y_n)$ and $(t_{n+1}, y_{n+1})$ and simply average them, a two-stage Runge-Kutta
method takes an arbitrary combination of the function values $f(t_n, y_n)$ and
$f(t_n + \alpha h, y_n + \beta hf(t_n, y_n))$. Specifically, we set

$$y_{n+1} = y_n + c_1 hf(t_n, y_n) + c_2 hf(t_n + \alpha h,\, y_n + \beta hf(t_n, y_n)) \tag{7.3.13}$$

and then determine conditions on $c_1$, $c_2$, $\alpha$, and $\beta$ that guarantee the
approximation generated by (7.3.13) is second-order through a comparison to
the Taylor expansion of $y(t_n + h)$. It can be shown that among the infinitely many
possible valid choices for $c_1$, $c_2$, $\alpha$, and $\beta$, taking $\alpha = \beta = 1$ and $c_1 = c_2 = 1/2$
results in Heun’s method, which justifies the fact that Heun’s method is
second-order.
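The two-stage family (7.3.13) is easy to experiment with in code. The sketch below is ours and not from the text; the function name `rk2_step` is hypothetical. It takes one step with arbitrary weights and shows that $\alpha = \beta = 1$, $c_1 = c_2 = 1/2$ reproduces Heun’s method, while $\alpha = \beta = 1/2$, $c_1 = 0$, $c_2 = 1$ reproduces the Modified Euler method (7.2.15). Both choices satisfy the standard second-order conditions $c_1 + c_2 = 1$ and $c_2\alpha = c_2\beta = 1/2$, which the text alludes to without stating.

```python
import math

def rk2_step(f, t, y, h, c1, c2, alpha, beta):
    """Take one step of the general two-stage rule (7.3.13)."""
    k1 = f(t, y)                              # slope at the current point
    k2 = f(t + alpha * h, y + beta * h * k1)  # slope at a predicted later point
    return y + c1 * h * k1 + c2 * h * k2

# Test problem assumed for illustration: y' = e^(2t) - y, y(0) = 1 (example 7.2.2)
f = lambda t, y: math.exp(2 * t) - y

heun_y1 = rk2_step(f, 0.0, 1.0, 0.1, 0.5, 0.5, 1.0, 1.0)      # Heun's method
midpoint_y1 = rk2_step(f, 0.0, 1.0, 0.1, 0.0, 1.0, 0.5, 0.5)  # Modified Euler
print(heun_y1, midpoint_y1)
```

Seeing both earlier methods fall out of one formula makes it plausible that other weightings, and more stages, could do better still.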

Heun’s method is an example of a two-stage Runge-Kutta method; two-stage
refers to the fact that slopes are evaluated or estimated at two points. It
is possible to achieve even higher order Runge-Kutta methods by generalizing
the idea in (7.3.13). In particular, we can take arbitrary combinations of the
values (or estimated values) of $f(t, y)$ at points in the interval $t_n \le t \le t_{n+1}$
and select the weights so that the approximation agrees with the Taylor series
expansion for $y(t_n + h)$ up to, and including, the term involving $h^4$, $h^5$, or
whatever accuracy we desire. The details of the rigorous development of such
methods are complicated and unenlightening. But, a more intuitive approach
can help us gain a better sense of why the Runge-Kutta method works so well
and where the formulas used in the algorithm come from.

If we recall our development of Heun’s method and the Modified Euler
method, each was linked to the idea of numerically approximating a definite
integral. Specifically, Heun’s method is analogous to the trapezoid rule, and the
Modified Euler method corresponds to the midpoint rule. The trapezoid rule
and midpoint rule both give the exact value of the definite integral of any linear
function; in addition, when a function has consistent concavity over an interval,
the midpoint rule is roughly twice as accurate as the trapezoid rule and the errors
in the midpoint and trapezoid rules have opposite signs. As such, it makes sense
to take a weighted average of the two rules in an effort to cancel out the error of
each. Computing the weighted average

$$\frac{2 \cdot \text{MID} + \text{TRAP}}{3}$$

results in a new method known as Simpson’s rule that is a remarkably accurate
approximation of the definite integral. In fact, it can be shown that Simpson’s
rule is exact for every cubic polynomial.
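The weighted average above can be tried on a single interval. The following sketch is our own illustration, not the text’s; the function names and the test integrand $x^3$ on $[0, 2]$ are arbitrary choices. The exact value of the integral is 4.

```python
def midpoint_rule(g, a, b):
    """One-panel midpoint rule on [a, b]."""
    return (b - a) * g((a + b) / 2)

def trapezoid_rule(g, a, b):
    """One-panel trapezoid rule on [a, b]."""
    return (b - a) * (g(a) + g(b)) / 2

def simpson_rule(g, a, b):
    """Weighted average (2*MID + TRAP)/3 of the two rules above."""
    return (2 * midpoint_rule(g, a, b) + trapezoid_rule(g, a, b)) / 3

cubic = lambda x: x ** 3
mid = midpoint_rule(cubic, 0, 2)
trap = trapezoid_rule(cubic, 0, 2)
simp = simpson_rule(cubic, 0, 2)
print(mid, trap, simp)  # prints 2.0 8.0 4.0
```

Here the midpoint rule undershoots by 2 and the trapezoid rule overshoots by 4, so the errors have opposite signs and a 2:1 weighting cancels them exactly, as the text asserts for cubic polynomials.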

This same increase in accuracy can be accomplished through similar ideas
in the numerical approximation of solutions to initial-value problems. Recalling
our work with Heun’s method (H) and the Modified Euler method (ME),

$$\text{H:} \quad y_{n+1} = y_n + h \cdot \frac{f(t_n, y_n) + f(t_{n+1}, y_n + hf(t_n, y_n))}{2} \tag{7.3.14}$$

$$\text{ME:} \quad y_{n+1} = y_n + hf\left(t_n + \frac{h}{2},\, y_n + \frac{h}{2} f(t_n, y_n)\right) \tag{7.3.15}$$

we note that each uses a different expression for $\Delta y$, the approximate change in
$y(t)$ in moving from $t_n$ to $t_{n+1}$. If we let

$$\Delta y_H = \frac{h}{2}\left[f(t_n, y_n) + f(t_{n+1}, y_n + hf(t_n, y_n))\right]$$

and

$$\Delta y_{ME} = hf\left(t_n + \frac{h}{2},\, y_n + \frac{h}{2} f(t_n, y_n)\right)$$

then the analogy to Simpson’s Rule for approximating the solution $y$ to the IVP
$y' = f(t, y)$, $y(t_0) = y_0$ is given by

$$y_{n+1} = y_n + \frac{2\Delta y_{ME} + \Delta y_H}{3} \tag{7.3.16}$$

Using (7.3.14) and (7.3.15) and letting $a_n = f(t_n, y_n)$, we have the approximation
rule given by $y_{n+1} = y_n + \Delta y_S$ where

$$\Delta y_S = \frac{2}{3} hf\left(t_n + \frac{h}{2},\, y_n + \frac{h}{2}a_n\right) + \frac{1}{3} \cdot \frac{h}{2}\left[a_n + f(t_{n+1}, y_n + ha_n)\right]$$

$$= \frac{h}{6}\left[a_n + 4f\left(t_n + \frac{h}{2},\, y_n + \frac{h}{2}a_n\right) + f(t_{n+1}, y_n + ha_n)\right] \tag{7.3.17}$$

If we slightly modify this expression for $\Delta y_S$ in recognition of the fact that as
we proceed across the interval we have more and more information available
(and hence a better approximation of the slope to use), the fourth-order
Runge-Kutta rule emerges. In particular, rather than rely on the value $a_n$ at every stage
in (7.3.17), we recognize that we are attempting to compute approximate slopes
at not just the left endpoint, but also at the midpoint and right endpoint. It makes
sense that we should use these approximations as they become available to us;
for instance, when we compute the approximate slope at the right endpoint,
we ought to use the approximate slope at the midpoint to do so. Furthermore,
given that the midpoint slope is weighted at 4 and the others at 1 in the average
given by (7.3.17), it is reasonable to invest additional effort ensuring that the
midpoint slope is as accurate as possible.

As in Heun’s method, the computations are easier to understand, track, and
implement if we introduce some additional notation. In particular, letting

$$\begin{aligned}
a_n &= f(t_n, y_n) && \text{slope at left endpoint} \\
b_n &= f\left(t_n + \tfrac{1}{2}h,\, y_n + \tfrac{1}{2}ha_n\right) && \text{slope at midpoint} \\
c_n &= f\left(t_n + \tfrac{1}{2}h,\, y_n + \tfrac{1}{2}hb_n\right) && \text{updated slope at midpoint} \\
d_n &= f(t_n + h,\, y_n + hc_n) && \text{slope at right endpoint}
\end{aligned} \tag{7.3.18}$$

we can replace the expression $4f(t_n + h/2, y_n + (h/2)a_n)$ in (7.3.17) with the more
accurate estimate $2b_n + 2c_n$ and replace $f(t_{n+1}, y_n + ha_n)$ with $f(t_{n+1}, y_n + hc_n)$;
each of these updates takes advantage of the most recent calculation of the
approximate slope at points nearby. We thus arrive at the fourth-order
Runge-Kutta method by setting $y_{n+1} = y_n + \Delta y$ to find

$$y_{n+1} = y_n + \frac{h}{6}(a_n + 2b_n + 2c_n + d_n) \tag{7.3.19}$$

where $a_n$, $b_n$, $c_n$, and $d_n$ are defined as at (7.3.18).

Again, through a lengthy development involving complicated calculations,
it can be established rigorously that (7.3.19) is a fourth-order approximation
technique: the resulting truncation error in the approximation is proportional
to $h^5$. The next example demonstrates the remarkable accuracy of the
Runge-Kutta method.
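Formulas (7.3.18) and (7.3.19) can be sketched as a short routine. The code below is a minimal illustration of ours, not the authors’ implementation; the name `rk4` is hypothetical, and the test problem is the IVP $y' = e^{2t} - y$, $y(0) = 1$ used in the surrounding examples, with the exact solution quoted in example 7.2.2.

```python
import math

def rk4(f, t0, y0, h, steps):
    """Return [y_0, ..., y_steps] from the fourth-order Runge-Kutta method."""
    t, y = t0, y0
    ys = [y]
    for n in range(steps):
        a = f(t, y)                          # slope at left endpoint
        b = f(t + h / 2, y + (h / 2) * a)    # slope at midpoint
        c = f(t + h / 2, y + (h / 2) * b)    # updated slope at midpoint
        d = f(t + h, y + h * c)              # slope at right endpoint
        y = y + (h / 6) * (a + 2 * b + 2 * c + d)   # weighted average, (7.3.19)
        t = t0 + (n + 1) * h
        ys.append(y)
    return ys

f = lambda t, y: math.exp(2 * t) - y
exact = lambda t: (2 / 3) * math.exp(-t) + (1 / 3) * math.exp(2 * t)

ys = rk4(f, 0.0, 1.0, 0.1, 10)
print(ys[1], ys[10], abs(exact(1.0) - ys[10]))
```

Four evaluations of $f$ per step are required, but no partial derivatives of $f$ appear anywhere, which is the main practical advantage over the Taylor methods of section 7.3.1.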


Example 7.3.2 Execute ten steps of the fourth-order Runge-Kutta method
with $h = 0.1$ to find an approximate solution of the initial-value problem

$$y' = e^{2t} - y, \quad y(0) = 1$$

Compare the results to those of the second-order Taylor method.


Solution. This is the same IVP as we considered in example 7.3.1. Recall that
the exact solution to the problem is $y(t) = \frac{2}{3}e^{-t} + \frac{1}{3}e^{2t}$.

To implement the Runge-Kutta method, we use $f(t, y) = e^{2t} - y$ and
compute $a_n$, $b_n$, $c_n$, and $d_n$ as given by (7.3.18). Using the initial condition
$(t_0, y_0) = (0, 1)$, we compute

$$a_0 = f(t_0, y_0) = f(0, 1) = e^{2 \cdot 0} - 1 = 0$$

$$b_0 = f\left(t_0 + \frac{h}{2},\, y_0 + \frac{h}{2}a_0\right) = f(0.05, 1 + 0.05 \cdot 0) = f(0.05, 1) = e^{2 \cdot 0.05} - 1 = 0.105170918$$

$$c_0 = f\left(t_0 + \frac{h}{2},\, y_0 + \frac{h}{2}b_0\right) = f(0.05, 1 + 0.05 \cdot 0.105170918) = f(0.05, 1.005258546) = e^{2 \cdot 0.05} - 1.005258546 = 0.099912372$$

$$d_0 = f(t_1, y_0 + hc_0) = f(0.1, 1 + 0.1 \cdot 0.099912372) = f(0.1, 1.009991237) = e^{2 \cdot 0.1} - 1.009991237 = 0.211411521$$

and therefore

$$y_1 = y_0 + \frac{h}{6}(a_0 + 2b_0 + 2c_0 + d_0) = 1 + \frac{0.1}{6}(0 + 0.210341836 + 0.199824744 + 0.211411521) = 1.010359635$$

Table 7.4
Fourth-order Runge-Kutta method and second-order Taylor’s method
applied to the IVP $y' = e^{2t} - y$, $y(0) = 1$ using $h = 0.1$

$t_n$ | Runge-Kutta (RK) $y_n$ | Solution $y(t_n)$ | RK error $\lvert y(t_n) - y_n \rvert$ | Taylor error $\lvert y(t_n) - y_n \rvert$
0     | 1                      | 1                 | 0           | 0
0.2   | 1.043096313            | 1.043095401       | 0.000000912 | 0.000798112
0.4   | 1.188729047            | 1.188727007       | 0.000002040 | 0.001976353
0.6   | 1.472583611            | 1.472580065       | 0.000003546 | 0.003699992
0.8   | 1.950569107            | 1.950563451       | 0.000005656 | 0.006223842
1     | 2.708280362            | 2.70827166        | 0.000008701 | 0.009934023


Implementing these same calculations for subsequent steps, we can generate the 
output displayed in table 7.4, where again we report the results from every other 
step. The error from Taylor’s method is being reported from table 7.3. 


In table 7.4 we can see the exceptional accuracy of the fourth-order Runge-Kutta
method. In one sense, this is not surprising. Because it is a fourth-order method, we
expect the error in the first step to be proportional to $h^5 = (0.1)^5 = 0.00001$,
which is in contrast to the second-order Taylor’s method with error proportional
to $h^3 = 0.001$. In each method, the errors are in fact much smaller; one reason
why this is so can be understood by thinking about the coefficient $1/5! = 1/120$
that arises in the Taylor expansion of $y(t_0 + h)$ and multiplies $h^5$.

What can be considered surprising about the Runge-Kutta method is 
that it generates such significant accuracy through a relatively limited number 
of computations and by only evaluating the function f(t, y) from the IVP 
at a select number of points, without the need to compute higher order 
derivatives. Fundamentally, the method takes four actual or approximate slopes 
and computes a weighted average of them in order to predict the next value 
of the solution function y(t). This fourth-order Runge-Kutta method is so 
accurate that it is used as the standard plotting tool in Maple when using 
the DEplot command. In addition, if we command Maple to produce a
numerical estimate to the solution of a stated IVP, the standard option in the
dsolve command is a slightly more sophisticated algorithm known as the
Runge-Kutta-Fehlberg method.


Exercises 7.3 In exercises 1-10, use (a) the second-order Taylor’s method 
and (b) the fourth-order Runge-Kutta method to estimate y(1) using h = 0.1, 
and compare the approximations generated by the methods. In exercises 1-6, 
compare the approximations with the exact solution. Each IVP in exercises 1-10 
is identical to those in exercises 1-10 in section 7.2. 


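1. $y' + 2ty = 0$, $y(0) = -2$
2. $y' = 2y - 1$, $y(0) = 2$
3. $y' - y = 0$, $y(0) = 2$
4. $(y')^2 - 2y = 0$, $y(0) = 2$
5. $y' - y = 1$, $y(0) = 0$
6. $tyy' = -1 - y^2$, $y(0) = 2$
7. $y' + ty = t^2$, $y(0) = 1$
8. $y' + y = t$, $y(0) = 1$
9. $y' + \sin y = 2e^{t}$, $y(0) = 0$
10. $y' = 2e^{t/2} \sin\sqrt{y}$, $y(0) = 0$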


7.4 Methods for systems and higher order equations


In section 6.4, we introduced an extension of Euler’s method for estimating the 
solution to nonlinear IVPs such as 


x) =9y—y", x(0)=1 
vo y(0) =8 (7.4.1) 
We again choose to use the notation $\mathbf{x} = [x \;\; y]^T$ rather than $[x_1 \;\; x_2]^T$ because
we will be using subscripts to label approximations to the component solutions
$x(t)$ and $y(t)$: for instance, $x_1 \approx x(t_1)$, where $t_1 = t_0 + h$. Recalling that $x$ and $y$
are each implicit functions of $t$, we can view (7.4.1) in the form

$$\begin{aligned}
x' &= f(x, y, t), & x(t_0) &= x_0 \\
y' &= g(x, y, t), & y(t_0) &= y_0
\end{aligned} \tag{7.4.2}$$


For a single initial-value problem $y' = f(t, y)$, $y(0) = y_0$, we have developed a
variety of methods for estimating the solution, including Euler’s method, Heun’s
method, and Runge-Kutta, in order of increasing accuracy. We will generalize
each of these methods to the situation for systems, leaving it as an exercise for
the reader to consider other alternatives, such as the Modified Euler method.
Throughout, we keep in mind that for a single IVP, every method has the form

$$y_{n+1} = y_n + \Delta y$$

where $\Delta y$ is an estimate that is obtained by taking the step-size $h$ times some
approximation of the slope of the solution $y$ at or near $(t_n, y_n)$.
Because Euler’s method is the simplest, we begin there. 


7.4.1 Euler’s method for systems 


Recall that for a single IVP $y' = f(t, y)$, $y(0) = y_0$, Euler’s method is given by
the algorithm

$$y_{n+1} = y_n + hf(t_n, y_n) \tag{7.4.3}$$

where $t_{n+1} = t_n + h$, given a step-size $h$. As was shown in section 6.4, to
implement Euler’s method for a system of two IVPs in the form (7.4.2), for
the step from the approximation $(x_n, y_n)$ to the approximation $(x_{n+1}, y_{n+1})$, we
compute

$$\begin{aligned}
x_{n+1} &= x_n + h \cdot f(t_n, x_n, y_n) \\
y_{n+1} &= y_n + h \cdot g(t_n, x_n, y_n)
\end{aligned} \tag{7.4.4}$$

Viewed from a vector perspective, if we let

$$\mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix} \quad \text{and} \quad \mathbf{F}(t, \mathbf{x}) = \begin{bmatrix} f(t, x, y) \\ g(t, x, y) \end{bmatrix}$$

it follows that Euler’s method for systems is given by the rule

$$\mathbf{x}^{(n+1)} = \mathbf{x}^{(n)} + h\mathbf{F}(t_n, \mathbf{x}^{(n)}) \tag{7.4.5}$$

We use the superscript $\mathbf{x}^{(n)} \approx \mathbf{x}(t_n)$ to denote the approximation since subscripts
on vectors often indicate particular entries in the vector.
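The vector rule (7.4.5) can be sketched with plain Python lists standing in for vectors. The code below is our own illustration; the name `euler_system` is hypothetical, and the right-hand side `F` is an assumption on our part, namely the linear system $x' = -x + 2y$, $y' = -2x - y$ with $x(0) = 2$, $y(0) = 0$, chosen to agree with the worked values of example 7.4.1.

```python
def euler_system(F, t0, x0, h, steps):
    """Return the list of approximation vectors x^(0), ..., x^(steps)."""
    t, x = t0, list(x0)
    history = [list(x)]
    for n in range(steps):
        slope = F(t, x)                                # F(t_n, x^(n))
        x = [xi + h * si for xi, si in zip(x, slope)]  # componentwise update (7.4.5)
        t = t0 + (n + 1) * h
        history.append(list(x))
    return history

def F(t, x):
    # Assumed system: x' = -x + 2y, y' = -2x - y
    return [-x[0] + 2 * x[1], -2 * x[0] - x[1]]

history = euler_system(F, 0.0, [2.0, 0.0], 0.1, 10)
print(history[1], history[2])
```

Because the update is applied to every component with the same slope vector, the same routine works unchanged for systems of any size.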

In section 6.4, we saw evidence that Euler’s method is not very effective 
because of the errors that arise. To demonstrate this further, we consider an 
example involving a linear system whose solution we know exactly. 


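Example 7.4.1 Use Euler’s method with $h = 0.1$ to estimate the solution $\mathbf{x}(1)$
to the initial-value problem

$$\mathbf{x}' = \begin{bmatrix} -1 & 2 \\ -2 & -1 \end{bmatrix}\mathbf{x}, \quad \mathbf{x}(0) = \begin{bmatrix} 2 \\ 0 \end{bmatrix}$$

Compare the results to the exact solution.


Solution. Using established methods from chapter 3, it is straightforward to
show that the solution to the given IVP is

$$\mathbf{x}(t) = 2e^{-t}\begin{bmatrix} \cos 2t \\ -\sin 2t \end{bmatrix}$$

To estimate this solution via Euler’s method, we first observe that

$$\mathbf{F}(t, \mathbf{x}) = \begin{bmatrix} -1 & 2 \\ -2 & -1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -x + 2y \\ -2x - y \end{bmatrix}$$

To compute $\mathbf{x}^{(1)} \approx \mathbf{x}(t_1)$, we use (7.4.5) and write

$$\mathbf{x}^{(1)} = \mathbf{x}^{(0)} + h\mathbf{F}(0, \mathbf{x}^{(0)}) = \begin{bmatrix} 2 \\ 0 \end{bmatrix} + 0.1\begin{bmatrix} -1 & 2 \\ -2 & -1 \end{bmatrix}\begin{bmatrix} 2 \\ 0 \end{bmatrix} = \begin{bmatrix} 2 \\ 0 \end{bmatrix} + 0.1\begin{bmatrix} -2 \\ -4 \end{bmatrix} = \begin{bmatrix} 1.8 \\ -0.4 \end{bmatrix}$$

Continuing Euler’s method in this manner for the subsequent nine steps with
$h = 0.1$ to estimate $\mathbf{x}(1)$, we find the results shown in table 7.5, where the values
from every other step are reported.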


The final column in table 7.5 merits some discussion. Since our exact solution
is a vector function and the approximate solutions are also vectors, the error
at each stage is given by the vector $\mathbf{e}^{(n)} = |\mathbf{x}(t_n) - \mathbf{x}^{(n)}|$, where $|\cdot|$ denotes
the absolute value function applied to each entry. The size of a vector can be measured
by a single number, its length (or magnitude or norm), which is computed by taking the
square root of the sum of the squares of its entries. For a vector $\mathbf{x} \in \mathbb{R}^3$, its length
is $\|\mathbf{x}\| = \sqrt{x_1^2 + x_2^2 + x_3^2}$, where $x_1$, $x_2$, and $x_3$ are the entries in $\mathbf{x}$. The entries in
the final column in table 7.5 are computed by taking the length of the vector $\mathbf{e}^{(n)}$,
which is the difference between the exact solution and the approximate solution
at step $n$. For example, the error that is present at the second step is

$$\|\mathbf{e}^{(2)}\| = \left\| \, \left| \begin{bmatrix} 1.54 \\ -0.72 \end{bmatrix} - \begin{bmatrix} 1.5082 \\ -0.6376 \end{bmatrix} \right| \, \right\| = \left\| \, \left| \begin{bmatrix} 0.03180 \\ -0.08234 \end{bmatrix} \right| \, \right\| = \sqrt{(0.03180)^2 + (-0.08234)^2} = 0.08827$$

which is the second entry in the third column of table 7.5.

Table 7.5
Euler’s method applied to the IVP in example 7.4.1 using $h = 0.1$

$t_n$ | Euler’s method $\mathbf{x}^{(n)}$ | Exact solution $\mathbf{x}(t_n)$ | Euler error $\|\mathbf{x}(t_n) - \mathbf{x}^{(n)}\|$
0     | (2, 0)                      | (2, 0)                       | 0.000000000
0.2   | (1.54, -0.72)               | (1.508201923, [illegible])   | [illegible]
0.4   | (0.9266, [illegible])       | (0.934032947, [illegible])   | [illegible]
0.6   | (0.314314, [illegible])     | (0.397732304, [illegible])   | [illegible]
0.8   | (-0.18542494, -1.02741408)  | (-0.026240382, -0.898274743) | [illegible]
1     | (-0.512646273, [illegible]) | (-0.306183731, [illegible])  | [illegible]

Clearly, the errors in Euler’s method are significant. From our earlier work 
with Heun’s method and the Runge-Kutta method, we expect that we can 
attain much better approximations by using analogous approaches for systems. 
We consider Heun’s method next. 


7.4.2 Heun's method for systems 


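From our most recent work, we know that if we view a system of IVPs from the
perspective of vector functions, we are trying to estimate the solution to

$$\mathbf{x}' = \mathbf{F}(t, \mathbf{x}), \quad \mathbf{x}(t_0) = \mathbf{x}^{(0)}$$

and that from this point of view, the vector version of Euler’s method is

$$\mathbf{x}^{(n+1)} = \mathbf{x}^{(n)} + h\mathbf{F}(t_n, \mathbf{x}^{(n)})$$

Recalling that Heun’s method for a single differential equation is given by the
rule

$$y_{n+1} = y_n + \frac{h}{2}(a_n + b_n) \tag{7.4.6}$$

where $a_n = f(t_n, y_n)$ and $b_n = f(t_{n+1}, y_n + ha_n)$, we realize that the vector analog
of (7.4.6) is

$$\mathbf{x}^{(n+1)} = \mathbf{x}^{(n)} + \frac{h}{2}\left(\mathbf{a}^{(n)} + \mathbf{b}^{(n)}\right) \tag{7.4.7}$$

where $\mathbf{a}^{(n)}$ and $\mathbf{b}^{(n)}$ are given by

$$\mathbf{a}^{(n)} = \mathbf{F}(t_n, \mathbf{x}^{(n)}) \quad \text{and} \quad \mathbf{b}^{(n)} = \mathbf{F}(t_{n+1}, \mathbf{x}^{(n)} + h\mathbf{a}^{(n)}) \tag{7.4.8}$$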

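In order to compare and contrast the vector version of Heun’s method
with Euler’s method, we consider the following example, which builds upon
example 7.4.1.

Example 7.4.2. Use Heun’s method with $h = 0.1$ to estimate the solution $\mathbf{x}(1)$
to the initial-value problem

$$\mathbf{x}' = \begin{bmatrix} -1 & 2 \\ -2 & -1 \end{bmatrix}\mathbf{x}, \quad \mathbf{x}(0) = \begin{bmatrix} 2 \\ 0 \end{bmatrix}$$

Compare the results to the exact solution and to those from Euler’s method in
example 7.4.1.

Solution. We are considering the IVP

$$\mathbf{x}' = \mathbf{F}(t, \mathbf{x}) = \begin{bmatrix} -1 & 2 \\ -2 & -1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \quad \mathbf{x}(0) = \begin{bmatrix} 2 \\ 0 \end{bmatrix}$$

To compute $\mathbf{x}^{(1)} \approx \mathbf{x}(0.1)$ by Heun’s method, we first compute

$$\mathbf{a}^{(0)} = \mathbf{F}(t_0, \mathbf{x}^{(0)}) = \begin{bmatrix} -1 & 2 \\ -2 & -1 \end{bmatrix}\begin{bmatrix} 2 \\ 0 \end{bmatrix} = \begin{bmatrix} -2 \\ -4 \end{bmatrix}$$

Next, to determine $\mathbf{b}^{(0)}$ we write

$$\mathbf{b}^{(0)} = \mathbf{F}(t_1, \mathbf{x}^{(0)} + h\,\mathbf{a}^{(0)}) = \begin{bmatrix} -1 & 2 \\ -2 & -1 \end{bmatrix}\begin{bmatrix} 2 + 0.1 \cdot (-2) \\ 0 + 0.1 \cdot (-4) \end{bmatrix} = \begin{bmatrix} -2.6 \\ -3.2 \end{bmatrix}$$

Finally, we determine $\mathbf{x}^{(1)} = \mathbf{x}^{(0)} + \frac{h}{2}\left(\mathbf{a}^{(0)} + \mathbf{b}^{(0)}\right)$ to find

$$\mathbf{x}^{(1)} = \begin{bmatrix} 2 \\ 0 \end{bmatrix} + \frac{0.1}{2}\left(\begin{bmatrix} -2 \\ -4 \end{bmatrix} + \begin{bmatrix} -2.6 \\ -3.2 \end{bmatrix}\right) = \begin{bmatrix} 1.77 \\ -0.36 \end{bmatrix}$$

Updating our work and computing the subsequent approximations results in
the values for $\mathbf{x}^{(2)}, \ldots, \mathbf{x}^{(10)}$ shown in table 7.6, where we also display the errors
computed in table 7.5 for Euler’s method applied to the same IVP.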


It is apparent from table 7.6 that just as Heun’s method for a single IVP is a 
substantial improvement over Euler’s method, it is also better for systems. At 
the same time, knowing that even higher order methods such as Runge-Kutta 
are available, we aspire to develop even more accurate methods for systems by 
converting the Runge-Kutta method for a single DE to one for systems. 
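The update (7.4.7)-(7.4.8) translates almost line for line into code. The sketch below is illustrative only: it assumes the same IVP as above, $\mathbf{x}' = A\mathbf{x}$ with $A = \begin{bmatrix} -1 & 2 \\ -2 & -1 \end{bmatrix}$ and $\mathbf{x}(0) = [2 \; 0]^T$, together with its exact solution $2e^{-t}[\cos 2t \;\; -\sin 2t]^T$, and the function names are hypothetical.

```python
import math

def F(t, x):
    """Right-hand side of x' = A x, A = [[-1, 2], [-2, -1]] (assumed form of the example)."""
    x1, x2 = x
    return (-x1 + 2.0 * x2, -2.0 * x1 - x2)

def exact(t):
    """Exact solution of the assumed IVP with x(0) = [2, 0]."""
    return (2.0 * math.exp(-t) * math.cos(2.0 * t),
            -2.0 * math.exp(-t) * math.sin(2.0 * t))

def heun_step(F, t, x, h):
    """One step of Heun's method for systems, following (7.4.7) and (7.4.8)."""
    a = F(t, x)                                             # slope at the current point
    predictor = tuple(xi + h * ai for xi, ai in zip(x, a))  # Euler predictor x + h a
    b = F(t + h, predictor)                                 # slope at the predicted point
    return tuple(xi + 0.5 * h * (ai + bi) for xi, ai, bi in zip(x, a, b))

def heun_system(F, t0, x0, h, steps):
    """Return the Heun iterates x^(0), ..., x^(steps)."""
    x = tuple(x0)
    history = [x]
    for n in range(steps):
        x = heun_step(F, t0 + n * h, x, h)
        history.append(x)
    return history

def error_norm(u, v):
    """Euclidean length of the vector u - v."""
    return math.sqrt(sum((ui - vi) ** 2 for ui, vi in zip(u, v)))

xs = heun_system(F, 0.0, (2.0, 0.0), 0.1, 10)
heun_error_at_1 = error_norm(exact(1.0), xs[10])
print(xs[1], xs[10], heun_error_at_1)
```

The first iterate `xs[1]` reproduces the hand computation of $\mathbf{x}^{(1)}$ in example 7.4.2, and `heun_error_at_1` corresponds to the last Heun error entry of table 7.6.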


7.4.3 Runge-Kutta method for systems 


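Recall that for the single first-order IVP $y' = f(t, y)$, $y(t_0) = y_0$, the fourth-order
Runge-Kutta method is given by

$$y_{n+1} = y_n + \frac{h}{6}(a_n + 2b_n + 2c_n + d_n) \tag{7.4.9}$$

where

$$\begin{aligned}
a_n &= f(t_n, y_n) \\
b_n &= f\left(t_n + \tfrac{1}{2}h,\; y_n + \tfrac{1}{2}h a_n\right) \\
c_n &= f\left(t_n + \tfrac{1}{2}h,\; y_n + \tfrac{1}{2}h b_n\right) \\
d_n &= f(t_n + h,\; y_n + h c_n)
\end{aligned} \tag{7.4.10}$$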
Table 7.6
Heun’s method applied to the IVP in example 7.4.2 using h = 0.1

t_n   Heun x^(n)                      Solution x(t_n)                 Heun error     Euler error
0     [2, 0]                          [2, 0]                          0.000000000    0.000000000
0.2   [1.50165, -0.6372]              [1.508201923, -0.637657545]     0.006567879    0.088268894
0.4   [0.924464441, -0.95685138]      [0.934032947, -0.961716336]     0.010734249    0.147271358
0.6   [0.389258164, -1.012962308]     [0.397732304, -1.023027791]     0.013157697    0.184285265
0.8   [-0.03046503, -0.884575076]     [-0.026240382, -0.898274743]    0.014336266    0.204979735
1     [-0.304699526, -0.654454923]    [-0.306183731, -0.669023658]    0.014644143    0.213748529


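Just as with Euler’s method and Heun’s method, we can develop the vector
analog of the Runge-Kutta method. We do so by letting

$$\mathbf{x}^{(n+1)} = \mathbf{x}^{(n)} + \frac{h}{6}\left(\mathbf{a}^{(n)} + 2\mathbf{b}^{(n)} + 2\mathbf{c}^{(n)} + \mathbf{d}^{(n)}\right) \tag{7.4.11}$$

where

$$\begin{aligned}
\mathbf{a}^{(n)} &= \mathbf{F}(t_n, \mathbf{x}^{(n)}) \\
\mathbf{b}^{(n)} &= \mathbf{F}\left(t_n + \tfrac{1}{2}h,\; \mathbf{x}^{(n)} + \tfrac{1}{2}h\,\mathbf{a}^{(n)}\right) \\
\mathbf{c}^{(n)} &= \mathbf{F}\left(t_n + \tfrac{1}{2}h,\; \mathbf{x}^{(n)} + \tfrac{1}{2}h\,\mathbf{b}^{(n)}\right) \\
\mathbf{d}^{(n)} &= \mathbf{F}(t_n + h,\; \mathbf{x}^{(n)} + h\,\mathbf{c}^{(n)})
\end{aligned} \tag{7.4.12}$$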


The computations for the Runge-Kutta method for systems can be implemented 
in a way very similar to those for Heun’s method. Doing so and applying the 
Runge-Kutta method to the IVP stated in examples 7.4.1 and 7.4.2 results in 
the values shown in table 7.7; we also display the error from Heun’s method by 
way of contrast. 

As with single IVPs, the results of the Runge-Kutta method for systems 
are impressive. This is again due to the fact that the Runge-Kutta method is 
fourth-order, while Heun’s method is only second-order. 
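As a rough illustration of (7.4.11)-(7.4.12), the following sketch implements one Runge-Kutta step for a system and applies it ten times with $h = 0.1$. It again assumes the IVP $\mathbf{x}' = A\mathbf{x}$, $A = \begin{bmatrix} -1 & 2 \\ -2 & -1 \end{bmatrix}$, $\mathbf{x}(0) = [2 \; 0]^T$ with exact solution $2e^{-t}[\cos 2t \;\; -\sin 2t]^T$; the helper names are hypothetical.

```python
import math

def F(t, x):
    """Right-hand side of x' = A x, A = [[-1, 2], [-2, -1]] (assumed form of the example)."""
    x1, x2 = x
    return (-x1 + 2.0 * x2, -2.0 * x1 - x2)

def exact(t):
    """Exact solution of the assumed IVP with x(0) = [2, 0]."""
    return (2.0 * math.exp(-t) * math.cos(2.0 * t),
            -2.0 * math.exp(-t) * math.sin(2.0 * t))

def shifted(x, scale, direction):
    """Return the vector x + scale * direction."""
    return tuple(xi + scale * di for xi, di in zip(x, direction))

def rk4_step(F, t, x, h):
    """One fourth-order Runge-Kutta step for a system, following (7.4.11)-(7.4.12)."""
    a = F(t, x)
    b = F(t + 0.5 * h, shifted(x, 0.5 * h, a))
    c = F(t + 0.5 * h, shifted(x, 0.5 * h, b))
    d = F(t + h, shifted(x, h, c))
    return tuple(xi + h / 6.0 * (ai + 2.0 * bi + 2.0 * ci + di)
                 for xi, ai, bi, ci, di in zip(x, a, b, c, d))

x = (2.0, 0.0)
h = 0.1
for n in range(10):                 # ten steps from t = 0 to t = 1
    x = rk4_step(F, n * h, x, h)

rk_error_at_1 = math.sqrt(sum((ei - xi) ** 2 for ei, xi in zip(exact(1.0), x)))
print(x, rk_error_at_1)
```

Under these assumptions the error at $t = 1$ is several hundred times smaller than the corresponding Heun error, in line with the comparison in table 7.7.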

We close this section by recalling the important link between higher order 
differential equations and systems of first-order equations. 
Table 7.7
Runge-Kutta method applied to the IVP in example 7.4.2 using h = 0.1

t_n   RK x^(n)                        Solution x(t_n)                 RK error      Heun error
0     [2, 0]                          [2, 0]                          0.00000000    0.000000000
0.2   [1.508211151, -0.637671316]     [1.508201923, -0.637657545]     0.00001658    0.006567879
0.4   [0.934038085, -0.961742990]     [0.934032947, -0.961716336]     0.00002714    0.010734249
0.6   [0.397725368, -1.023060398]     [0.397732304, -1.023027791]     0.00003334    0.013157697
0.8   [-0.026261217, -0.898304580]    [-0.026240382, -0.898274743]    0.00003639    0.014336266
1     [-0.306215262, -0.669043480]    [-0.306183731, -0.669023658]    0.00003724    0.014644143


7.4.4 Methods for higher order IVPs 


We have repeatedly used the fact that any linear nth-order differential equation
can be converted to a system of linear first-order equations. For example, given
a second-order equation such as $y'' + 2y' - 3y = \sin t$, we know that with the
substitution $x_1 = y$, $x_2 = y'$, it follows that $\mathbf{x} = [x_1 \; x_2]^T$ is a solution to the
system of differential equations

$$\begin{bmatrix} x_1' \\ x_2' \end{bmatrix} = \begin{bmatrix} x_2 \\ 3x_1 - 2x_2 + \sin t \end{bmatrix}$$

Given our current interest in approximating solutions to initial-value problems,
we are particularly focused on nonlinear equations, including

$$\theta'' + \frac{g}{L}\sin\theta = 0, \quad \theta(0) = a, \quad \theta'(0) = b$$

which governs the motion of a simple undamped pendulum, as developed in
section 6.1. In this setting, we are unable to determine an exact solution, and
thus wish to generate an approximate one. More generally, we want to be able
to develop an approximate solution to any nonlinear IVP. In the second-order
case, we can view this problem as having the form

$$y'' = f(t, y, y'), \quad y(0) = a, \quad y'(0) = b \tag{7.4.13}$$




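If we introduce the substitution $z = y'$, then $z' = y'' = f(t, y, y') = f(t, y, z)$, so
that (7.4.13) may be rewritten as the system of IVPs

$$\begin{aligned}
y' &= z, & y(0) &= a \\
z' &= f(t, y, z), & z(0) &= b
\end{aligned} \tag{7.4.14}$$

Letting $\mathbf{x} = [y \; z]^T$ and $\mathbf{F}(t, \mathbf{x}) = [z \;\; f(t, y, z)]^T$, we may rewrite (7.4.14) in the
form

$$\mathbf{x}' = \mathbf{F}(t, \mathbf{x}), \quad \mathbf{x}(0) = \begin{bmatrix} a \\ b \end{bmatrix}$$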


which is precisely the form we considered for Euler’s method, Heun’s method, 
and the Runge-Kutta method for systems. That is, once we have converted a 
higher order IVP to a system of first-order IVPs, we may choose from any of 
our existing approximation methods for systems of DEs. We demonstrate this 
for a particular example using Heun’s method. 


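Example 7.4.3 Use Heun’s method to estimate the solution $y(t)$ from $t = 0$
to $t = 1$ to the second-order IVP

$$y'' + 0.1y' + 4\sin y = 0, \quad y(0) = 1, \quad y'(0) = 0$$

with step-size $h = 0.1$.

Solution. We begin by letting $z = y'$, so that $z' = y'' = -4\sin y - 0.1y' =
-4\sin y - 0.1z$. Writing $\mathbf{x} = [y \; z]^T$, it follows that

$$\mathbf{x}' = \begin{bmatrix} y' \\ z' \end{bmatrix} = \begin{bmatrix} z \\ -4\sin y - 0.1z \end{bmatrix} = \mathbf{F}(t, \mathbf{x})$$

Recalling Heun’s method, we must compute

$$\mathbf{x}^{(n+1)} = \mathbf{x}^{(n)} + \frac{h}{2}\left(\mathbf{a}^{(n)} + \mathbf{b}^{(n)}\right)$$

where

$$\mathbf{a}^{(n)} = \mathbf{F}(t_n, \mathbf{x}^{(n)}) \quad \text{and} \quad \mathbf{b}^{(n)} = \mathbf{F}(t_{n+1}, \mathbf{x}^{(n)} + h\,\mathbf{a}^{(n)})$$

With the initial condition $\mathbf{x}^{(0)} = [1 \; 0]^T$, we first find that

$$\mathbf{a}^{(0)} = \begin{bmatrix} 0 \\ -4\sin(1) - 0.1 \cdot 0 \end{bmatrix} = \begin{bmatrix} 0 \\ -3.366 \end{bmatrix}$$

from which it follows that

$$\mathbf{b}^{(0)} = \mathbf{F}(0.1, \mathbf{x}^{(0)} + h\,\mathbf{a}^{(0)}) = \begin{bmatrix} -0.3366 \\ -3.332 \end{bmatrix}$$

Therefore, $\mathbf{x}^{(1)}$ is given by

$$\mathbf{x}^{(1)} = \mathbf{x}^{(0)} + \frac{h}{2}\left(\mathbf{a}^{(0)} + \mathbf{b}^{(0)}\right) = \begin{bmatrix} 1 \\ 0 \end{bmatrix} + \frac{0.1}{2}\left(\begin{bmatrix} 0 \\ -3.366 \end{bmatrix} + \begin{bmatrix} -0.3366 \\ -3.332 \end{bmatrix}\right) = \begin{bmatrix} 0.98317 \\ -0.33490 \end{bmatrix}$$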
Table 7.8
Heun’s method applied to the second-order IVP in example 7.4.3 using h = 0.1

n    x^(n)                          a^(n)                          b^(n)                          x^(n+1)
0    [1, 0]                         [0, -3.365883939]              [-0.336588394, -3.3322251]     [0.98317058, -0.334905452]
2    [0.933202302, -0.659006349]    [-0.659006349, -3.148220455]   [-0.973828394, -2.952961961]   [0.851560565, -0.96406547]
4    [0.740589862, -1.240510452]    [-1.240510452, -2.574842511]   [-1.497994703, -2.163059344]   [0.603664604, -1.477405545]
6    [0.445309489, -1.663161107]    [-1.663161107, -1.556632719]   [-1.818824379, -0.919669919]   [0.271210214, -1.786976239]
8    [0.088048126, -1.840715689]    [-1.840715689, -0.167666053]   [-1.857482294, 0.569252015]    [-0.096861773, -1.820636391]
10   [-0.276080886, -1.728307853]   [-1.728307853, 1.263178986]    [-1.601989955, 1.896140173]    [-0.442595776, -1.570341895]

Executing similar computations for the remaining nine steps to approximate 
x(1), we find the results shown in table 7.8. 


From the results of table 7.8, we see that

$$\mathbf{x}(1) \approx \mathbf{x}^{(10)} = \begin{bmatrix} -0.276080886 \\ -1.728307853 \end{bmatrix}$$

Recalling that $\mathbf{x}(t) = [y(t) \; z(t)]^T$ and that our ultimate goal is to estimate the
solution $y(t)$ to the stated IVP, it follows that $y(1) \approx -0.2761$.
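The computations of example 7.4.3 can be reproduced with a short program. The sketch below is one possible implementation (the function names are illustrative, not part of the text): it encodes $\mathbf{F}(t, \mathbf{x}) = [z \;\; -4\sin y - 0.1z]^T$ and applies the Heun update ten times with $h = 0.1$.

```python
import math

def F(t, x):
    """System form of y'' + 0.1 y' + 4 sin y = 0 with x = [y, z] and z = y'."""
    y, z = x
    return (z, -4.0 * math.sin(y) - 0.1 * z)

def heun_step(F, t, x, h):
    """One Heun step: a = F(t, x), b = F(t + h, x + h a), x_new = x + (h/2)(a + b)."""
    a = F(t, x)
    predictor = tuple(xi + h * ai for xi, ai in zip(x, a))
    b = F(t + h, predictor)
    return tuple(xi + 0.5 * h * (ai + bi) for xi, ai, bi in zip(x, a, b))

h = 0.1
xs = [(1.0, 0.0)]                   # x^(0) = [y(0), y'(0)]
for n in range(10):
    xs.append(heun_step(F, n * h, xs[-1], h))

y_at_1 = xs[10][0]                  # first entry of x^(10) estimates y(1)
print(xs[1], y_at_1)
```

Only the first entry of each iterate approximates $y$; the second entry approximates $y'$ and is carried along because the method needs it.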


The approach in example 7.4.3 can be implemented for higher order initial- 
value problems through a substitution to convert a given higher order equation 
to a system of first-order ones. More accurate results may be obtained 
through applying the fourth-order Runge-Kutta method for systems. We note
in particular that we can estimate solutions not only to nonlinear equations,
but also to linear equations with non-constant coefficients. For example,
solutions to IVPs like

$$y'' + ty = 10\sin 2t, \quad y(0) = y'(0) = 0$$

can now be approximated.


Exercises 7.4

In exercises 1-6, (a) use Euler’s method for systems with h = 0.1
to estimate the solution x(1) to the initial-value problem, (b) use Heun’s method
for systems with h = 0.1 to estimate the solution x(1) to the initial-value
problem, and (c) if possible, compare the results to the exact solution.


or 


ox=| | [ J=+[i} x0=|5| 


In exercises 7-13, (a) use Heun’s method and (b) use the Runge-Kutta method 
to estimate the solution of the system of IVPs at the given t-value using the 
stated h-value. 


7. x! =y—2xy, x(0) = 0.75 t=1,h=0.1 
y =4Axy—x, y(0)=0.5 


8. x) =4-y’, x(0) = —2 t=3,h=0.05 
y =l1-xt+y, y(0)=-1 

9. x' = cos y, x(0) = 2 t = 1.5, h = 0.1
y' = 1 - sin x, y(0) = 3

10. x' = 2x - y, x(0) = 1 t = 1.5, h = 0.1
y' = -4x + 2y, y(0) = 1

ll. x’ = e7’, x(0) = 0 t=2,h=0.05 
y =1/(1+x?), y(0) =0 

12. x’ =In(2+y), x(0) =—-1 t=2,h=0.1 
y=xty, y(0) = —0.5 

13. x/=y—x?, x(0)=1 t=1,h=0.05 
y =x—8y", y(0) = 0.75 


14. Recall from section 6.1 that the nonlinear system of differential equations

W' = -0.75W + 0.25MW
M' = 0.5M - 0.1MW

models the numbers of wolves and moose (each measured in hundreds) in
a predator-prey model, where time is measured in years. Assume that at
time t = 0 there are 250 moose and 550 wolves present. Estimate the
numbers of moose and wolves present at t = 3, 6, and 9 years using a
step-size of (a) h = 0.1, and (b) h = 0.05 with both Euler’s method and
Heun’s method.


In exercises 15—18, (a) convert the given second-order IVP to a system of first- 
order IVPs, (b) use Euler’s method for systems with h = 0.1 to estimate the 
solution y(1) to the initial-value problem, (c) use Heun’s method for systems 
with h = 0.1 to estimate the solution y(1) to the initial-value problem, and (d) if 
possible, compare the results to the exact solution. 


15. $y'' + 16y = 2t + 1$, $y(0) = y'(0) = 0$

16. $y'' + 16y = 2\sin 2t$, $y(0) = y'(0) = 0$

17. $y'' + 16y^2 = 2\sin 2t$, $y(0) = y'(0) = 0$

18. $y'' + 0.2(y')^2 + 2y^2 = 4e^t \sin t$, $y(0) = y'(0) = 0$


7.5 For further study 
7.5.1 Predator—prey equations 


Recall that a predator-prey scenario is modeled by the equations

$$\begin{aligned}
x' &= 0.6x - 0.3xy, & x(0) &= 2 \\
y' &= -0.9y + 0.6xy, & y(0) &= 3
\end{aligned} \tag{7.5.1}$$


(a) Determine the nontrivial equilibrium solution of (7.5.1) and use a 
computer algebra system to plot the direction field of the system in a 
suitable window containing the equilibrium solution and the given initial 
condition. 


(b) Use a computer to implement Heun’s method to estimate the solution 
(x(t), y(t)) of (7.5.1) on the interval 0 < t < 20 using h= 0.1. 


(c) Use your data from (b) to generate two plots: one a parametric plot of the 
approximate curve (x(t), y(t)) and the other a simultaneous plot of the 
separate functions x(t) and y(t) on the same coordinate axes. Discuss the 
behavior of the populations x(t) and y(t) over time. 


(d) Modify your calculations in (b) appropriately to investigate the impact of 
changing the parameter ‘0.3’ in the first equation to each of the values 0.1, 
0.2, 0.4, 0.5, and 0.9. In each case, generate the same plots as instructed in 
(c). What impact does this have on the behavior of the populations? 


(e) Modify your calculations in (b) in order to consider the following different
initial conditions: x(0) = 1.7, y(0) = 1.8; x(0) = 2.5, y(0) = 3.6; x(0) = 5,
y(0) = 1. In each case, generate the same plots as instructed in (c). What
impact do the initial conditions have on the behavior of the populations?


7.5.2 Competitive species 


In section 6.5.2, we developed the model

$$\begin{aligned}
x' &= ax\left(1 - \frac{1}{A}x - \alpha y\right) \\
y' &= by\left(1 - \frac{1}{B}y - \beta x\right)
\end{aligned} \tag{7.5.2}$$

where $a$, $A$, and $\alpha$ are positive constants ($a$ is the population $x(t)$’s growth
constant, $A$ its carrying capacity, and $\alpha$ a parameter that reflects the competition
for resources from population $y(t)$). The constants $b$, $B$, and $\beta$ play the same
roles for the second population.


(a) In (7.5.2), let $a = 0.5$, $b = 0.25$, $A = 5$, $B = 2$, $\alpha = 0.04$, and $\beta = 0.02$.
Find all equilibrium points of the system and use a computer algebra
system to plot a direction field of this system that contains all the
equilibrium solutions.


(b) Apply Heun’s method to estimate the solution (x(t), y(t)) of (7.5.2) on 
the interval 0 < t < 20 using h = 0.1. Plot the trajectory of the 
approximate solution. 


(c) Leaving all other parameters the same, change the value of B to B= 8. 
Repeat questions (a) and (b) and discuss the differences between the 
results for the two B-values. 


(d) Repeat question (c) with B = 15. 


(e) What is the largest value of B for which the two populations can coexist
with a stable equilibrium in which each population tends to a nonzero
value as $t \to \infty$? What value(s) of B ensure that population $y(t)$ will
dominate as $t \to \infty$ and force $x(t) \to 0$?


(f) For each of the three values of B above, experiment with the impact of the 
following different sets of initial conditions: x(0) = 1, y(0) = 1; x(0) =5, 
y(0) = 1; x(0) = 1, y(0) = 5; x(0) = 5, y(0) = 5. How do the different 
initial conditions impact the behaviors of the two populations? 


7.5.3 The damped pendulum 


In section 6.5.1, it was shown that for a pendulum with an arm of length $L$, bob
of mass $m$, and damping constant $c$, the angle $\theta$ that the arm forms with the
vertical axis at time $t$ satisfies the IVP

$$L\theta'' = -g\sin\theta - c\theta', \quad \theta(0) = \theta_0, \quad \theta'(0) = \omega_0 \tag{7.5.3}$$

(a) Using the change of variables $x = \theta$, $y = x'$, show that the nonlinear
second-order IVP (6.5.2) is equivalent to the system

$$\begin{aligned}
x' &= y \\
y' &= -\frac{g}{L}\sin x - \frac{c}{L}y
\end{aligned} \tag{7.5.4}$$

(b) Apply Heun’s method to estimate the solution (x(t), y(t)) of (7.5.4) with 
g = 9.8, L=1, and c = 1 with initial conditions x(0) = 2, y(0) = 2 on the 


interval 0 < t < 10 using h = 0.1. Plot the trajectory of the approximate 
solution. 


(c) Repeat question (b) using c = 0.1 and c = 5. Discuss the differences in the 
results. 


(d) Investigate the effects of changing the initial conditions to the following: 
x(0) = 2, y(0) = 5; x(0) = 2, y(0) = 15; x(0) = 2, y(0) = —5. Do so for 
each of the three c-values noted above and discuss the differences among 


the results and the physical interpretation that explains how the pendulum 
is behaving. 
Series solutions for differential equations


8.1 Motivating problems 


In more sophisticated courses in mathematical physics or special functions, a
type of linear differential equation frequently arises that differs from those we
have studied to date. From several perspectives, we have thoroughly analyzed
the behavior of linear differential equations with constant coefficients of the
form

$$y'' + a_1 y' + a_0 y = f(t)$$

But there are other important and well-known equations with non-constant 
coefficients. We list some of these here in anticipation of more in-depth study 
in subsequent sections. 

Airy’s equation is a linear second-order equation that arises in physics in the
study of light refraction. While it can be stated in a slightly more general form,
a good example to begin with is

$$y'' + ty = 0 \tag{8.1.1}$$

The explicit presence of the coefficient “t” in (8.1.1) makes this equation
substantially different from those (such as $y'' + y = 0$) we have already solved.
If we recall the initial approach to solving $y'' + y = 0$, we can gain intuition
for how to proceed with (8.1.1). We know that guessing $y = e^{rt}$ in $y'' + y = 0$
leads to the characteristic equation $r^2 + 1 = 0$, so that $y = e^{it}$ or $y = e^{-it}$. We
then know from Euler’s formula that both $y = \sin t$ and $y = \cos t$ arise as linearly
independent solutions to $y'' + y = 0$. One key characteristic the exponential, sine,
and cosine functions have in common is that they can be expressed as infinite
power series; indeed, this fact was used to justify the validity of Euler’s formula.
In particular, we can write

$$e^t = 1 + t + \frac{t^2}{2!} + \frac{t^3}{3!} + \cdots + \frac{t^n}{n!} + \cdots \tag{8.1.2}$$

$$\sin t = t - \frac{t^3}{3!} + \frac{t^5}{5!} - \cdots + (-1)^n \frac{t^{2n+1}}{(2n+1)!} + \cdots \tag{8.1.3}$$

$$\cos t = 1 - \frac{t^2}{2!} + \frac{t^4}{4!} - \cdots + (-1)^n \frac{t^{2n}}{(2n)!} + \cdots \tag{8.1.4}$$

Each of these expressions for $e^t$, $\sin t$, and $\cos t$ is of the form $\sum_{n=0}^{\infty} a_n t^n$ and is
valid for every real number $t$.

In the upcoming chapter, rather than making guesses of the form $y = e^{rt}$,
we instead assume much more generally that $y$ is a nice enough function to have
a power series expansion of the form $y = \sum_{n=0}^{\infty} a_n t^n$, and then substitute this
form of the potential solution function $y$ into the differential equation in order
to deduce the coefficients $a_n$.

Other well-known differential equations that we will consider include the
Hermite equation

$$y'' - 2ty' + 2qy = 0 \tag{8.1.5}$$

where $q$ is a constant, the Laguerre equation

$$ty'' + (1 - t)y' + qy = 0 \tag{8.1.6}$$

(again where $q$ is constant), and the Bessel equation

$$t^2 y'' + ty' + (t^2 - n^2)y = 0 \tag{8.1.7}$$

where $n$ is a constant.

Again, in each of (8.1.5), (8.1.6), and (8.1.7), it is the presence of non-constant
coefficient(s) involving $t$ that makes us seek new ways to find solutions.
Finally, recalling an elementary differential equation from calculus further
motivates the importance of infinite series representations of functions. Among
the simplest of all first-order differential equations are those of the form
$y' = f(t)$; these can be solved (in theory) by integrating. But if we consider
an example such as

$$y' = e^{-t^2}$$

we are immediately stuck since the function $e^{-t^2}$ lacks an elementary
antiderivative.

If we use (8.1.2) and replace $t$ with $-t^2$, then we can write

$$e^{-t^2} = 1 - t^2 + \frac{t^4}{2!} - \frac{t^6}{3!} + \cdots + (-1)^n \frac{t^{2n}}{n!} + \cdots$$

Integrating, it follows that

$$y = C + t - \frac{t^3}{3} + \frac{t^5}{5 \cdot 2!} - \frac{t^7}{7 \cdot 3!} + \cdots + (-1)^n \frac{t^{2n+1}}{(2n+1) \cdot n!} + \cdots$$

Hence we are able to determine the general solution function $y$, although we
must be content to leave $y$ in its series representation. Discovering solutions in
this power series form will be typical of the results we obtain in our work in this
chapter.
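Although the series for $y$ cannot be summed in terms of elementary functions, its partial sums are easy to evaluate numerically. The sketch below (an illustration only; the function name and the choice of 30 terms are assumptions) sums the series with $C = 0$ and compares it with $\frac{\sqrt{\pi}}{2}\,\mathrm{erf}(t)$, the standard special-function expression for $\int_0^t e^{-s^2}\,ds$.

```python
import math

def integral_series(t, terms=30, C=0.0):
    """Partial sum of C + sum_{n>=0} (-1)^n t^(2n+1) / ((2n+1) n!)."""
    total = C
    for n in range(terms):
        total += (-1) ** n * t ** (2 * n + 1) / ((2 * n + 1) * math.factorial(n))
    return total

for t in (0.5, 1.0, 2.0):
    reference = 0.5 * math.sqrt(math.pi) * math.erf(t)   # integral of e^(-s^2) from 0 to t
    print(t, integral_series(t), reference)
```

For moderate values of $t$ a few dozen terms already agree with the reference value to many decimal places, which illustrates why a series representation can be a perfectly usable form of a solution.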


8.2 A review of Taylor and power series 


From calculus, we know that if a function has a derivative at a given point $t = a$,
then the function is approximately linear near $t = a$. Indeed, the existence of
the first derivative ensures that the function is smooth: the function must be
continuous at $a$ and its graph cannot have a corner there. Of course, if having
one derivative is a good thing, having several derivatives is even better. The
best possible scenario of all is that the function is infinitely differentiable at
$t = a$. That is, $f^{(k)}(a)$ exists for every $k = 0, 1, 2, \ldots$. A function that is infinitely
differentiable at $t = a$ and at all points in some small open interval containing $a$
is said to be analytic¹ at $t = a$. If a function fails to be analytic at a given point,
we say that $f$ is singular at that point. For example, the rational function


is 
IO= By yG—H 


is singular at $t = 4$ and $t = \pm 3$ since it is undefined at these values (as is each
of its derivatives). At every other value of $t$, $f(t)$ is analytic.

Much of the theory of analytic functions is a natural extension of the ideas 
of Taylor polynomials and Taylor series from calculus. Here our intention is 
not to develop a complete theory of analytic functions, but rather to remind the 
reader of important results on Taylor series and extend this perspective slightly 
in order to suit our purposes. Most results will be stated without proof. 

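To begin, we assume that $f$ is an analytic function at $a = 0$ and recall that
the polynomial functions

$$\begin{aligned}
P_0(t) &= f(0) \\
P_1(t) &= f(0) + f'(0)t \\
P_2(t) &= f(0) + f'(0)t + \frac{f''(0)}{2!}t^2 \\
&\;\;\vdots \\
P_k(t) &= f(0) + f'(0)t + \frac{f''(0)}{2!}t^2 + \cdots + \frac{f^{(k)}(0)}{k!}t^k
\end{aligned} \tag{8.2.1}$$

are called the Taylor polynomials of $f$ at $a = 0$ and form the sequence of partial
sums of the infinite series

$$P(t) = f(0) + f'(0)t + \frac{f''(0)}{2!}t^2 + \cdots + \frac{f^{(k)}(0)}{k!}t^k + \cdots \tag{8.2.2}$$
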


1 Usually when analytic functions are discussed, we allow the function to have complex inputs and 
consider a disk of a given radius around a complex point. For our purposes, a discussion restricted to 
real values is sufficient. 
In particular, the function $P_k(t)$ in (8.2.1) is the kth Taylor polynomial of $f$ at
$a = 0$, and the infinite series (8.2.2) is called the Taylor series of $f$ centered at
$a = 0$; the series in (8.2.2) converges if and only if the sequence of partial sums
converges. That is, $P(t)$ is defined if and only if

$$\lim_{k \to \infty} P_k(t)$$


exists. If this limit fails to exist, we say that the Taylor series diverges at this point. 
What is perhaps most remarkable is the fact that wherever the series (8.2.2) 
converges, it does so to the value of the given analytic function f; moreover, 
the Taylor series converges in an interval centered at t = 0 that extends to the 
nearest singular point. Formally, we have the following theorem. 


Theorem 8.2.1 Suppose that f(t) is an analytic function at 0 and R is the 
distance from 0 to the nearest singular point of f(t). Then the Taylor series of 
f(t) centered at t = 0 converges to f(t) in the interval |t| < R and diverges in 
the interval |t| > R. 


The number $R$ is called the radius of convergence of the Taylor series. We note,
too, that it is possible for singular points to be complex, so $R$ is not necessarily
the distance from 0 to the nearest real singular point. We also observe specifically
that for any $t$ such that $|t| < R$, we know

$$f(t) = \sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!} t^k$$

We consider an example to see many of these ideas at work.


Example 8.2.1 Find the Taylor series of f(t) = In(1 + t) centered at t = 0 and 
determine the radius of convergence of the series. 


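Solution. We begin by taking the first several derivatives of $f$ and evaluating
them at 0:

$$\begin{aligned}
f(t) &= \ln(1 + t) & f(0) &= \ln(1) = 0 \\
f'(t) &= (1 + t)^{-1} & f'(0) &= 1 \\
f''(t) &= (-1)(1 + t)^{-2} & f''(0) &= -1 \\
f'''(t) &= (-2)(-1)(1 + t)^{-3} & f'''(0) &= 2 \\
f^{(4)}(t) &= (-3)(-2)(-1)(1 + t)^{-4} & f^{(4)}(0) &= -6
\end{aligned}$$

From these calculations, we see that the fourth Taylor polynomial is

$$P_4(t) = 0 + t - \frac{1}{2!}t^2 + \frac{2}{3!}t^3 - \frac{6}{4!}t^4 = t - \frac{1}{2}t^2 + \frac{1}{3}t^3 - \frac{1}{4}t^4$$

The established pattern implies that the Taylor series of $f(t) = \ln(1 + t)$ is

$$t - \frac{1}{2}t^2 + \frac{1}{3}t^3 - \frac{1}{4}t^4 + \cdots = \sum_{n=1}^{\infty} (-1)^{n+1} \frac{1}{n} t^n$$

From calculus, the standard way to test a power series for convergence is to use
the Ratio Test. Doing so here with $a_n = (-1)^{n+1}(1/n)t^n$, we observe that

$$\lim_{n \to \infty} \left| \frac{a_{n+1}}{a_n} \right| = \lim_{n \to \infty} \left| \frac{(-1)^{n+2}\,(1/(n+1))\,t^{n+1}}{(-1)^{n+1}\,(1/n)\,t^n} \right| = \lim_{n \to \infty} \frac{n}{n+1}\,|t| = |t|$$

The Ratio Test states that a given series converges if $\lim_{n \to \infty} |a_{n+1}/a_n| < 1$.
Thus, if $|t| < 1$, it follows that

$$\ln(1 + t) = t - \frac{1}{2}t^2 + \frac{1}{3}t^3 - \frac{1}{4}t^4 + \cdots + (-1)^{n+1}\frac{1}{n}t^n + \cdots \tag{8.2.3}$$

converges.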


The result of example 8.2.1 makes further sense in light of theorem 8.2.1 since
we know that $f(t) = \ln(1 + t)$ has a singularity at $t = -1$. If we substitute $t = -1$
in (8.2.3), the opposite of the harmonic series arises ($-1 - \frac{1}{2} - \frac{1}{3} - \frac{1}{4} - \cdots$), which
diverges. However, it can be shown by the alternating series test that (8.2.3)
does converge when $t = 1$; indeed, for any power series that converges for
$|t| < R$, it is possible for the series to converge at both of the points $t = \pm R$,
at neither, or at just one of them. While this is an interesting mathematical topic
in its own right, it is largely irrelevant in our discussion of series solutions to
differential equations.
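The convergence behavior described above can be observed numerically. The following sketch (illustrative only; the function name is hypothetical) evaluates partial sums of (8.2.3) inside the interval of convergence and at the two endpoints $t = 1$ and $t = -1$.

```python
import math

def log_series(t, terms):
    """Partial sum of t - t^2/2 + t^3/3 - ... using the given number of terms."""
    return sum((-1) ** (n + 1) * t ** n / n for n in range(1, terms + 1))

print(log_series(0.5, 60), math.log(1.5))    # |t| < 1: rapid agreement with ln(1 + t)
print(log_series(1.0, 1000), math.log(2.0))  # t = 1: converges, but slowly
print(log_series(-1.0, 1000))                # t = -1: negative harmonic sums keep decreasing
```

Inside the interval the partial sums settle quickly; at $t = 1$ a thousand terms give only about three correct digits, and at $t = -1$ the partial sums decrease without bound.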

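We next state several prominent Taylor series expansions along with their
respective radii of convergence and leave the development and testing of these
series for convergence to the exercises at the end of this section.

$$\begin{aligned}
e^t &= 1 + t + \frac{t^2}{2!} + \cdots + \frac{t^n}{n!} + \cdots & R &= \infty \\
\sin t &= t - \frac{t^3}{3!} + \frac{t^5}{5!} - \cdots + (-1)^n \frac{t^{2n+1}}{(2n+1)!} + \cdots & R &= \infty \\
\cos t &= 1 - \frac{t^2}{2!} + \frac{t^4}{4!} - \cdots + (-1)^n \frac{t^{2n}}{(2n)!} + \cdots & R &= \infty \\
\frac{1}{1 - t} &= 1 + t + t^2 + t^3 + \cdots + t^n + \cdots & R &= 1
\end{aligned} \tag{8.2.4}$$
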
From these fundamental Taylor series, the series expansions of other related
functions may often be easily found. The following example demonstrates one
way in which this may be accomplished.


Example 8.2.2 Find the Taylor series expansion of

$$f(t) = \frac{t}{1 + 4t^2}$$

as well as its radius of convergence.


Solution. If we first omit the $t$ in the numerator of $f(t)$, we can use the final
result from (8.2.4) and substitute $-4t^2$ for $t$, writing

$$\begin{aligned}
\frac{1}{1 + 4t^2} = \frac{1}{1 - (-4t^2)} &= 1 + (-4t^2) + (-4t^2)^2 + (-4t^2)^3 + \cdots + (-4t^2)^n + \cdots \\
&= 1 - 4t^2 + 16t^4 - 64t^6 + \cdots + (-1)^n 4^n t^{2n} + \cdots
\end{aligned} \tag{8.2.5}$$

To get the Taylor series of $f(t)$, we now multiply both sides of (8.2.5) by $t$,
and have

$$f(t) = \frac{t}{1 + 4t^2} = t - 4t^3 + 16t^5 - 64t^7 + \cdots + (-1)^n 4^n t^{2n+1} + \cdots \tag{8.2.6}$$

Since the original series from (8.2.4) converges for $|t| < 1$ and we replaced $t$
with $-4t^2$, it follows that (8.2.5) converges for $|-4t^2| < 1$, or in other words
for $|t| < 1/2$. Multiplying (8.2.5) by $t$ has no effect on the radius of convergence
of the series, and therefore (8.2.6) converges for $|t| < 1/2$. Note further that the
denominator $1 + 4t^2$ of $f(t)$ is zero at $t = \pm i/2$; each of these complex numbers
lies a distance of 1/2 unit away from the origin and is a singular point of $f$. This
observation is additional evidence that $R = 1/2$ is the radius of convergence of
the series expansion of $f(t)$.


Similar reasoning may be used to find expansions for such functions as aa 
tsin4t, and (cost — 1)/t?. In each case, the approach of example 8.2.2 is 
far simpler than using the definition of Taylor series directly and computing 
derivatives of the given function. 

One reason why the development of Taylor series for functions similar 
to those in (8.2.4) is so straightforward is the fact that Taylor series are 
unique. Said differently, if we can find a power series expression for a given 
function, it must be the Taylor series. This is stated formally in the following 
theorem. 


Theorem 8.2.2 The series $\sum_{k=0}^{\infty} b_k t^k$ converges in the interval $|t| < R$ to the
function $f(t)$ if and only if $f(t)$ is analytic for all $t$ such that $|t| < R$ and

$$b_k = \frac{1}{k!} f^{(k)}(0)$$

An immediate consequence of theorem 8.2.2 is that if $\sum_{k=0}^{\infty} b_k t^k = 0$ for all $t$ in
the interval $|t| < R$, then $b_k = 0$ for every $k$. We will use this result frequently when
we solve differential equations by equating like coefficients of two equal power
series.

If we cannot use substitution to find a Taylor series expansion (as we did in 
example 8.2.2), it may be possible to use differentiation or integration to do so. 
The following example introduces this approach. 


Example 8.2.3 Find the Taylor series expansion and radius of convergence of 
f(t) =arctant. 


Solution. If we were to attempt to find the series via the definition by taking
derivatives, we would find that the process becomes laborious after computing
$f'(t) = 1/(1 + t^2)$, since differentiating will involve both the chain and quotient
rules. Instead, we observe that

$$f'(t) = \frac{1}{1 + t^2}$$

itself has a series expansion that is not difficult to find. Similar to our work in
example 8.2.2, we use the final result in (8.2.4) and substitute $-t^2$ for $t$ to write

$$\begin{aligned}
f'(t) = \frac{1}{1 - (-t^2)} &= 1 + (-t^2) + (-t^2)^2 + (-t^2)^3 + \cdots + (-t^2)^n + \cdots \\
&= 1 - t^2 + t^4 - t^6 + \cdots + (-1)^n t^{2n} + \cdots
\end{aligned} \tag{8.2.7}$$

Because we now have a series expansion for $f'(t)$, it is natural to integrate both
sides of (8.2.7) to find the series for $f(t)$. Doing so, we see that

$$f(t) = \arctan t = C + t - \frac{1}{3}t^3 + \frac{1}{5}t^5 - \frac{1}{7}t^7 + \cdots + \frac{(-1)^n}{2n+1}t^{2n+1} + \cdots \tag{8.2.8}$$

It is a straightforward exercise to use the Ratio Test to show that (8.2.8)
converges for all $t$ such that $|t| < 1$. Moreover, since $\arctan(0) = 0$, it follows
that $C = 0$.


While intuition guides our work in example 8.2.3, and we certainly know that 
we can integrate any finite polynomial, the one step that is perhaps questionable 
is when we say we will integrate both sides of (8.2.7) to find the series for f(t). 
That this step is legitimate (and that it preserves the radius of convergence) is 
the conclusion of our next formal result, the Taylor series Differentiation and 
Integration Theorem. 


Theorem 8.2.3 If $f(t)$ has the Taylor series expansion

$$f(t) = \sum_{k=0}^{\infty} b_k t^k, \quad |t| < R$$

then its antiderivative $F(t) = \int_0^t f(x)\,dx$ and its derivative $f'(t)$ have the
respective Taylor series expansions

$$F(t) = \sum_{k=0}^{\infty} b_k \int_0^t x^k\,dx = \sum_{k=0}^{\infty} \frac{b_k}{k+1}\,t^{k+1}, \quad |t| < R \tag{8.2.9}$$

$$f'(t) = \sum_{k=0}^{\infty} b_k \frac{d}{dt}\left[t^k\right] = \sum_{k=1}^{\infty} k\,b_k\,t^{k-1}, \quad |t| < R \tag{8.2.10}$$



That is, theorem 8.2.3 states that any power series may be differentiated 
or integrated term-wise and that doing so does not change the radius of 
convergence of the power series. This fact makes more reasonable our plan 
to solve differential equations by letting y be an unknown power series, taking 
its appropriate derivative(s), and substituting into the differential equation to 
determine the coefficients in the series. 
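Because term-wise integration and differentiation act only on the coefficients, both operations in theorem 8.2.3 can be carried out on a list $[b_0, b_1, b_2, \ldots]$. The sketch below is one illustrative way to do this with exact rational arithmetic; the function names are hypothetical, and the input list is a truncation of the series (8.2.7).

```python
from fractions import Fraction

def integrate_series(b, constant=0):
    """Coefficients of C + sum b_k t^(k+1) / (k+1), as in (8.2.9)."""
    return [Fraction(constant)] + [Fraction(bk) / (k + 1) for k, bk in enumerate(b)]

def differentiate_series(b):
    """Coefficients of sum k b_k t^(k-1), as in (8.2.10)."""
    return [k * Fraction(bk) for k, bk in enumerate(b)][1:]

# 1/(1 + t^2) = 1 - t^2 + t^4 - t^6 + ...   (truncated after the t^6 term)
geometric = [1, 0, -1, 0, 1, 0, -1]

arctan_coeffs = integrate_series(geometric)   # C = 0 because arctan(0) = 0
print(arctan_coeffs)                          # coefficients of t - t^3/3 + t^5/5 - t^7/7
print(differentiate_series(arctan_coeffs))    # recovers the original coefficients
```

Integrating and then differentiating returns the original list, mirroring the fact that the two operations in the theorem undo one another on power series.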

Finally, it is not always possible to determine an explicit expression for
the nth coefficient of the Taylor series expansion of a function in terms of $n$.
In this situation, we must be content with knowing the values of the first few
coefficients. For this type of computation, we sometimes abbreviate the tail end
of a power series by writing

$$O(t^n) = a_n t^n + a_{n+1} t^{n+1} + a_{n+2} t^{n+2} + \cdots \tag{8.2.11}$$

where we read the notation $O(t^n)$ as “order of $t^n$”. For instance, we could write

$$e^t = 1 + t + \frac{t^2}{2} + O(t^3)$$

The next example emphasizes the fact that we cannot always explicitly determine
a formula for the general nth term in the Taylor expansion of a function.


Example 8.2.4 Find the first four terms of the Taylor series expansion about
$t = 0$ of the function

$$f(t) = \frac{t}{e^t + 1}$$

Solution. Because $f$ is the quotient of two functions that are analytic
everywhere and the denominator is never zero, it follows that $f$ is analytic
everywhere. In particular, $f$ is analytic at $a = 0$ and, therefore, has a Taylor
series expansion there of the form

$$\frac{t}{e^t + 1} = b_0 + b_1 t + b_2 t^2 + b_3 t^3 + \cdots \tag{8.2.12}$$

We know from the standard expansion of $e^t$ that

$$e^t + 1 = 2 + t + \frac{t^2}{2!} + \frac{t^3}{3!} + \cdots$$

Multiplying both sides of (8.2.12) by this expression for $e^t + 1$, we obtain the
identity

$$t = \left(2 + t + \frac{t^2}{2!} + \frac{t^3}{3!} + \cdots\right)\left(b_0 + b_1 t + b_2 t^2 + b_3 t^3 + \cdots\right)$$

Distributing to multiply these two series, we find that

$$t = 2b_0 + (2b_1 + b_0)t + \left(2b_2 + b_1 + \frac{b_0}{2}\right)t^2 + \left(2b_3 + b_2 + \frac{b_1}{2} + \frac{b_0}{6}\right)t^3 + \cdots$$

In order for this identity to hold, the uniqueness of Taylor series expansions
established in theorem 8.2.2 implies that all of the coefficients of powers of $t$ on
the left must equal the corresponding coefficients of powers of $t$ on the right. In
particular, it must be the case that

$$\begin{aligned}
0 &= 2b_0 \\
1 &= 2b_1 + b_0 \\
0 &= 2b_2 + b_1 + \frac{1}{2}b_0 \\
0 &= 2b_3 + b_2 + \frac{1}{2}b_1 + \frac{1}{6}b_0
\end{aligned}$$

From this sequence of equalities, it follows that $b_0 = 0$, $b_1 = 1/2$, $b_2 = -1/4$,
and $b_3 = 0$, so that

$$\frac{t}{e^t + 1} = \frac{1}{2}t - \frac{1}{4}t^2 + 0t^3 + \cdots$$
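The coefficient matching in example 8.2.4 is a triangular system that can be solved one coefficient at a time. The sketch below illustrates that process with exact fractions; the function name and the decision to compute five coefficients are assumptions made for the illustration.

```python
from fractions import Fraction
from math import factorial

def quotient_coefficients(numerator, denominator, count):
    """Solve numerator(t) = denominator(t) * (b_0 + b_1 t + ...) for b_0, ..., b_{count-1}.

    Both inputs are lists of Taylor coefficients with at least `count` entries,
    and denominator[0] must be nonzero.
    """
    b = []
    for k in range(count):
        # Coefficient of t^k on the right: sum over j of denominator[j] * b[k - j].
        known = sum(denominator[j] * b[k - j] for j in range(1, k + 1))
        b.append((numerator[k] - known) / denominator[0])
    return b

count = 5
numerator = [Fraction(0), Fraction(1)] + [Fraction(0)] * (count - 2)   # the function t
denominator = [Fraction(1, factorial(k)) for k in range(count)]        # e^t ...
denominator[0] += 1                                                    # ... plus 1

b = quotient_coefficients(numerator, denominator, count)
print(b)   # b_0, b_1, b_2, b_3 match the values found in example 8.2.4
```

The fifth value returned, $b_4$, is the next coefficient beyond those computed by hand; no closed form for the general coefficient is needed to obtain it.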


Exercises 8.2 
In exercises 1-4, determine the radius of convergence of the stated power series. 


In exercises 5-17, find the first four nonzero coefficients of the Taylor series
expansion for each function $f(t)$ about $a = 0$. In addition, state the radius of
convergence of the series expansion. Wherever possible, use known expansions
and the techniques of examples 8.2.2, 8.2.3, and 8.2.4.


5. f(t)=Vt41 
6. f(t) = +5t?—3t+8 

1 

7. f(i)= T+ 
8. f(t)=e* 

2t_ | 
ht aaerr 
6.76) 


17. f(t) = arctan e 


In exercises 18—24, find the first four nonzero coefficients of the Taylor series 
expansion for each integral by first finding the expansion of the integrand and 
then integrating term by term.* 


ro 
18. 1 ds 
o l+s4 


t 
a | s° sin s* ds 
0 


2 Your work in exercises 5-17 will be helpful. 
t
23. | cos s° ds 
0 


t 
24. / arctan s* ds 
0 


8.3 Power series solutions of linear equations 


In this section, we begin solving linear differential equations by assuming that 
the solution function may be expressed as a power series. To motivate our work, 
we revisit a familiar first-order equation (which we can solve easily by other 
means) to explore how series can be used in this way. 


Example 8.3.1 By assuming that y has a power series expansion of the form
$y(t) = a_0 + a_1 t + a_2 t^2 + a_3 t^3 + \cdots$, determine the solution to the initial-value
problem

\[ y' = y, \quad y(0) = 1 \]

Solution. Writing $y(t) = a_0 + a_1 t + a_2 t^2 + a_3 t^3 + \cdots$, we know

\[ y'(t) = a_1 + 2a_2 t + 3a_3 t^2 + 4a_4 t^3 + \cdots \]

Equating y and y', we observe that

\[ a_0 + a_1 t + a_2 t^2 + a_3 t^3 + \cdots = a_1 + 2a_2 t + 3a_3 t^2 + 4a_4 t^3 + \cdots \tag{8.3.1} \]

Because of the uniqueness of Taylor series expansions (theorem 8.2.2), we may
equate like coefficients of powers of t in (8.3.1), from which we deduce that the
following recurrence relation among the coefficients $a_i$ must hold:

\begin{align*}
a_0 &= a_1 \\
a_1 &= 2a_2 \\
a_2 &= 3a_3 \\
&\;\;\vdots \\
a_n &= (n+1)a_{n+1}
\end{align*}

Provided that we know $a_0$, we can find all of the remaining values of $a_i$. Clearly,
$a_0 = y(0)$, so using the initial condition y(0) = 1,

\[ a_0 = 1, \quad a_1 = 1, \quad a_2 = \frac{1}{2}, \quad a_3 = \frac{1}{3}a_2 = \frac{1}{3 \cdot 2}, \ldots \]

From this sequence of coefficients and the general recurrence relation
$a_{n+1} = \frac{1}{n+1}a_n$, we observe that $a_n = \frac{1}{n!}$ and therefore

\[ y(t) = 1 + t + \frac{1}{2}t^2 + \frac{1}{3!}t^3 + \cdots + \frac{1}{n!}t^n + \cdots \]

which we recognize as the familiar power series expansion of $y(t) = e^t$, the
solution to the IVP y' = y, y(0) = 1.
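As a quick numerical illustration of this example, the sketch below iterates the recurrence $a_{n+1} = a_n/(n+1)$ and compares the result with $1/n!$. The function name `coeffs_y_prime_equals_y` is a hypothetical helper chosen for this illustration.

```python
from fractions import Fraction
from math import exp, factorial

def coeffs_y_prime_equals_y(a0, count):
    """First `count` Taylor coefficients of the solution of y' = y, y(0) = a0."""
    a = [Fraction(a0)]
    for n in range(count - 1):
        a.append(a[n] / (n + 1))      # a_{n+1} = a_n / (n + 1)
    return a

a = coeffs_y_prime_equals_y(1, 12)
matches_factorials = all(a[n] == Fraction(1, factorial(n)) for n in range(12))
partial_sum_at_1 = float(sum(a))      # partial sum of the series at t = 1
print(matches_factorials, partial_sum_at_1, exp(1))
```

With twelve terms the partial sum at t = 1 already agrees with e to about eight decimal places.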
Obviously there is no need to use power series to solve the IVP given in
example 8.3.1, as it is a standard linear first-order equation. However, given our
desire to solve higher order equations that are linear, but for which we currently
lack a method for obtaining an analytic solution, this example is important since
we hope to generalize from the simpler first-order constant coefficient case to
the more difficult second-order non-constant coefficient one. For example, a
linear second-order differential equation such as

\[ y'' - 2ty' + y = 0 \tag{8.3.2} \]

in which the coefficients of y, y', and y'' are not all constant is not among the
collection of equations whose solutions we can currently determine. Equations
such as (8.3.2) belong to a family of equations of the general form

\[ y'' + p(t)y' + q(t)y = f(t) \tag{8.3.3} \]

that we now aspire to solve.

Before we solve equations of form (8.3.3), we consider one more familiar 
example that introduces other critical ideas that arise when solving linear 
second-order equations through power series expansions. Because we already 
know the solution to the equation we consider, we will be able to check our 
work appropriately and better see the role that series expansions play. 


Example 8.3.2 Solve the initial-value problem

\[ y'' + y = 0, \quad y(0) = 1, \quad y'(0) = 1 \]

by assuming that y has a power series expansion
$y(t) = a_0 + a_1 t + a_2 t^2 + a_3 t^3 + a_4 t^4 + \cdots$.

Solution. Since $y = a_0 + a_1 t + a_2 t^2 + a_3 t^3 + a_4 t^4 + \cdots$, it follows that

\[ y' = a_1 + 2a_2 t + 3a_3 t^2 + 4a_4 t^3 + \cdots \]

and

\[ y'' = 2a_2 + 3 \cdot 2\,a_3 t + 4 \cdot 3\,a_4 t^2 + 5 \cdot 4\,a_5 t^3 + \cdots \]

Substituting for y and y'' in the given equation y'' + y = 0, we have

\[ (a_0 + a_1 t + a_2 t^2 + a_3 t^3 + a_4 t^4 + \cdots) + (2a_2 + 6a_3 t + 12a_4 t^2 + 20a_5 t^3 + \cdots) = 0 \]

Gathering terms with like powers of t,

\[ (a_0 + 2a_2) + (a_1 + 6a_3)t + (a_2 + 12a_4)t^2 + (a_3 + 20a_5)t^3 + \cdots = 0 \tag{8.3.4} \]

Setting each coefficient of powers of t in (8.3.4) equal to zero implies that the
following sequence of equalities holds:

\begin{align*}
a_0 &= -2a_2 & a_1 &= -6a_3 \\
a_2 &= -12a_4 & a_3 &= -20a_5 \\
a_4 &= -30a_6 & a_5 &= -42a_7 \\
&\;\;\vdots & &\;\;\vdots \\
a_{2n} &= -(2n+2)(2n+1)a_{2n+2} & a_{2n+1} &= -(2n+3)(2n+2)a_{2n+3}
\end{align*}
We group these equations into the two columns shown for the natural reason
that the coefficients with even indices depend recursively on one another, as do
the coefficients with odd indices. Furthermore, we see that if we can identify
both $a_0$ and $a_1$ (which we can through the two stated initial conditions), then
we can determine all of the remaining coefficients.

Specifically, since y(0) = 1 and $a_0 = y(0)$, it follows that $a_0 = 1$. Similarly,
with the given condition y'(0) = 1 and the fact that $a_1 = y'(0)$, we know $a_1 = 1$.
Thus, from the sequence of equalities with even indices above,

\[ a_0 = 1, \quad a_2 = -\frac{1}{2}, \quad a_4 = -\frac{1}{12}a_2 = \frac{1}{4!}, \quad a_6 = -\frac{1}{30}a_4 = -\frac{1}{6!} \]

From this and the stated recurrence relation for $a_{2n}$ and $a_{2n+2}$, we observe that

\[ a_{2n} = (-1)^n \frac{1}{(2n)!}, \quad n = 0, 1, 2, \ldots \tag{8.3.5} \]

The formula (8.3.5) implies that the portion of the series expansion for y in
which all of the powers of t are even will be

\[ y_{\text{even}} = 1 - \frac{1}{2!}t^2 + \frac{1}{4!}t^4 - \frac{1}{6!}t^6 + \cdots \tag{8.3.6} \]

which we recognize as the familiar series expansion for cos t.

Returning to the recurrence relation involving the coefficients with odd
indices, nearly identical work to that with the even coefficients shows that

\[ a_1 = 1, \quad a_3 = -\frac{1}{6} = -\frac{1}{3!}, \quad a_5 = -\frac{1}{20}a_3 = \frac{1}{5!}, \quad \text{and} \quad a_7 = -\frac{1}{42}a_5 = -\frac{1}{7!} \]

These observations imply that the part of the expansion of y involving odd
powers of t has form

\[ y_{\text{odd}} = t - \frac{1}{3!}t^3 + \frac{1}{5!}t^5 - \frac{1}{7!}t^7 + \cdots \tag{8.3.7} \]

which is sin t.

Hence our work with series expansions at (8.3.6) and (8.3.7) has shown that

\begin{align*}
y &= 1 + t - \frac{1}{2!}t^2 - \frac{1}{3!}t^3 + \frac{1}{4!}t^4 + \frac{1}{5!}t^5 - \frac{1}{6!}t^6 - \frac{1}{7!}t^7 + \cdots \\
&= \left(1 - \frac{1}{2!}t^2 + \frac{1}{4!}t^4 - \frac{1}{6!}t^6 + \cdots\right) + \left(t - \frac{1}{3!}t^3 + \frac{1}{5!}t^5 - \frac{1}{7!}t^7 + \cdots\right) \\
&= \cos t + \sin t \tag{8.3.8}
\end{align*}


Again, it is no surprise that y = cos t + sin t is the solution to the IVP y'' + y = 0,
y(0) = 1, y'(0) = 1. We know from our work in several different contexts that
the general solution to this differential equation is $y = c_1 \cos t + c_2 \sin t$, and can
easily see that the given two initial conditions lead to $c_1 = c_2 = 1$. Even without
the initial conditions, we could have determined from our work in example 8.3.2
that $y = a_0 \cos t + a_1 \sin t$. Regardless, there is a great deal we can learn about
series solutions to differential equations by thinking carefully about our work
in this familiar example.
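The two interleaved recurrences of example 8.3.2 combine into the single rule $a_{n+2} = -a_n / ((n+2)(n+1))$. The sketch below, with illustrative helper names `coeffs_harmonic` and `evaluate`, generates the coefficients from $a_0$ and $a_1$ and compares a partial sum with cos t + sin t at one sample point.

```python
from fractions import Fraction
from math import cos, sin

def coeffs_harmonic(a0, a1, count):
    """Taylor coefficients of the solution of y'' + y = 0, y(0) = a0, y'(0) = a1."""
    a = [Fraction(a0), Fraction(a1)]
    for n in range(count - 2):
        a.append(-a[n] / ((n + 2) * (n + 1)))   # a_{n+2} = -a_n / ((n+2)(n+1))
    return a

def evaluate(coeffs, t):
    """Evaluate the partial sum sum_k coeffs[k] * t^k."""
    return sum(float(c) * t**k for k, c in enumerate(coeffs))

a = coeffs_harmonic(1, 1, 20)
t = 0.7
approx = evaluate(a, t)
exact = cos(t) + sin(t)
print(a[:6], approx, exact)
```

Changing the two starting values shows directly how the even coefficients scale with $a_0$ and the odd ones with $a_1$.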

First, we saw that in order to get the recurrence relations started, we needed
to know the values of $a_0$ and $a_1$. This reinforces the fact that the solution
space to the second-order equation is two dimensional, and suggests that the 
power series expansion has the property that it detects the need for two linearly 
independent solutions. Next, we observe from our work in example 8.3.2 that 
two different unlinked series solutions arose in the solution; these turned out to 
be the expansions for the cosine and sine functions, respectively, each of which 
has an infinite radius of convergence. This led to the overall solution series being 
convergent for every value of t. Finally, we note that normally we will need to 
be content with expressions that state the first few nonzero terms of a power 
series expansion, as we cannot expect in general to be able to recognize familiar 
power series expansions within solutions, as we did at (8.3.8). 

In general, we will be interested in linear differential equations of the form

\[ y'' + p(t)y' + q(t)y = 0 \tag{8.3.9} \]

If p(t) and q(t) are both analytic functions at t = a (that is, both have a Taylor
expansion at a), then we call t = a an ordinary point of the DE (8.3.9). Otherwise,
t = a is a singular point of (8.3.9). The following theorem tells us that if t = 0 is
an ordinary point of (8.3.9), then there exist two linearly independent solutions
to the DE that may be represented by Taylor series centered at t = 0.


Theorem 8.3.1 If t = 0 is an ordinary point of (8.3.9), then there exist two
linearly independent solutions

\[ y_1(t) = \sum_{n=0}^{\infty} a_n t^n \quad \text{and} \quad y_2(t) = \sum_{n=0}^{\infty} b_n t^n \tag{8.3.10} \]

Both series converge in a disk |t| < R, where R is at least as large as the distance
from the origin to the nearest singular point of the functions p(t) and q(t).


In example 8.3.2, the coefficient functions of y’ and y in the DE were 
simply the constant functions 0 and 1, which are each analytic everywhere. 
Theorem 8.3.1 implies that the two series expansions we found (which were 
those of the cosine and sine functions) must therefore converge everywhere. 
We see from this result that anytime the coefficient functions p(t) and q(t) 
are constant, the solution functions that arise must converge everywhere. This 
is not surprising, given our experience that in the case of linear differential 
equations with constant coefficients, solutions essentially consist of the functions 
$e^{kt}$, $\sin kt$, and $\cos kt$. More generally, we can now state that if p(t) and q(t)
are polynomial functions, which are also analytic everywhere, then the series 
in (8.3.10) must both converge everywhere. 

We now consider an example involving a differential equation that we are 
unable to solve by other means in order to gain more understanding of the role 
played by infinite series in its solution. 
Example 8.3.3 Consider the linear second-order differential equation

\[ y'' - 2ty' + y = 0 \tag{8.3.11} \]

Determine two linearly independent series solutions to this equation. Then,
solve the initial-value problem given by this DE along with the initial conditions
y(0) = 2, y'(0) = -1.

Solution. We begin by assuming that $y = a_0 + a_1 t + a_2 t^2 + a_3 t^3 + \cdots$. From
this, it follows

\begin{align*}
y' &= a_1 + 2a_2 t + 3a_3 t^2 + 4a_4 t^3 + \cdots = \sum_{n=1}^{\infty} n a_n t^{n-1} \\
-2ty' &= -2a_1 t - 4a_2 t^2 - 6a_3 t^3 - 8a_4 t^4 - \cdots = -\sum_{n=1}^{\infty} 2n a_n t^n \\
y'' &= 2a_2 + 6a_3 t + 12a_4 t^2 + 20a_5 t^3 + \cdots = \sum_{n=2}^{\infty} n(n-1) a_n t^{n-2}
\end{align*}

In many instances, it will be most convenient to work with power series
represented in the shorthand sigma ($\Sigma$) notation, which is how we will proceed
from here. Substituting in (8.3.11) with the series expressions for y'', -2ty', and
y, we find

\[ \sum_{n=2}^{\infty} n(n-1) a_n t^{n-2} - \sum_{n=1}^{\infty} 2n a_n t^n + \sum_{n=0}^{\infty} a_n t^n = 0 \tag{8.3.12} \]

In order to equate the coefficients of like powers of t, it is helpful to write each
series in (8.3.12) using the same indices for the sum. Replacing n with n + 2
allows us to write

\[ \sum_{n=2}^{\infty} n(n-1) a_n t^{n-2} = \sum_{n=0}^{\infty} (n+2)(n+1) a_{n+2} t^n \]

In addition, observe that

\[ \sum_{n=1}^{\infty} -2n a_n t^n = \sum_{n=0}^{\infty} -2n a_n t^n \]

because the term $-2n a_n$ vanishes when n = 0. Therefore we can revise (8.3.12)
to have the form

\[ \sum_{n=0}^{\infty} (n+2)(n+1) a_{n+2} t^n + \sum_{n=0}^{\infty} -2n a_n t^n + \sum_{n=0}^{\infty} a_n t^n = 0 \tag{8.3.13} \]

Now that each series is indexed from n = 0 with corresponding powers of t, we
can combine the three sums into one and write

\[ \sum_{n=0}^{\infty} \left[(n+2)(n+1) a_{n+2} - 2n a_n + a_n\right] t^n = 0 \tag{8.3.14} \]
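Because (8.3.14) implies that every coefficient of the series must be zero, we see
that the constants $a_n$ must satisfy the recurrence relation

\[ (n+2)(n+1) a_{n+2} - 2n a_n + a_n = 0 \]

or equivalently

\[ a_{n+2} = \frac{2n-1}{(n+2)(n+1)} a_n, \quad n = 0, 1, 2, \ldots \tag{8.3.15} \]

Here it is essential to observe that since the subscripts differ by two in (8.3.15),
we can obtain two distinct series solutions to the original equation (8.3.11), one
involving all of the even terms and the other all of the odd ones. In particular,
considering n = 0, 2, 4, ..., we have from (8.3.15) that

\[ a_2 = -\frac{1}{2}a_0, \quad a_4 = \frac{3}{4 \cdot 3}a_2 = \frac{-1 \cdot 3}{4!}a_0, \quad \text{and} \quad a_6 = \frac{7}{6 \cdot 5}a_4 = \frac{-1 \cdot 3 \cdot 7}{6!}a_0 \]

More generally, the pattern

\[ a_{2n} = \frac{(-1) \cdot 3 \cdot 7 \cdots (4n-5)}{(2n)!} a_0 \]

holds and therefore

\begin{align*}
y_1(t) &= a_0 - \frac{1}{2!}a_0 t^2 - \frac{3}{4!}a_0 t^4 - \frac{3 \cdot 7}{6!}a_0 t^6 - \cdots \\
&= a_0\left(1 + \sum_{n=1}^{\infty} \frac{(-1) \cdot 3 \cdot 7 \cdots (4n-5)}{(2n)!} t^{2n}\right) \tag{8.3.16}
\end{align*}

Similarly, if we examine the odd terms for n = 1, 3, 5, ... in (8.3.15), we see

\[ a_3 = \frac{1}{3 \cdot 2}a_1, \quad a_5 = \frac{5}{5 \cdot 4}a_3 = \frac{1 \cdot 5}{5!}a_1, \quad \text{and} \quad a_7 = \frac{9}{7 \cdot 6}a_5 = \frac{1 \cdot 5 \cdot 9}{7!}a_1 \]

Thus, we find

\[ a_{2n+1} = \frac{1 \cdot 5 \cdot 9 \cdots (4n-3)}{(2n+1)!} a_1 \]

and therefore

\begin{align*}
y_2(t) &= a_1 t + \frac{1}{3!}a_1 t^3 + \frac{1 \cdot 5}{5!}a_1 t^5 + \cdots \\
&= a_1\left(t + \sum_{n=1}^{\infty} \frac{1 \cdot 5 \cdot 9 \cdots (4n-3)}{(2n+1)!} t^{2n+1}\right) \tag{8.3.17}
\end{align*}

Because $y_1$ only involves even powers of t and $y_2$ only involves odd powers
of t, it is obvious that $y_1$ and $y_2$ must be linearly independent functions: it is
impossible for one to be a scalar multiple of the other. Hence we have found the
two basic solutions to the given DE and, taking $a_0 = 1$ in $y_1$ and $a_1 = 1$ in $y_2$, the
general solution is

\begin{align*}
y &= a_0 y_1 + a_1 y_2 \\
&= a_0\left(1 + \sum_{n=1}^{\infty} \frac{(-1) \cdot 3 \cdot 7 \cdots (4n-5)}{(2n)!} t^{2n}\right) + a_1\left(t + \sum_{n=1}^{\infty} \frac{1 \cdot 5 \cdot 9 \cdots (4n-3)}{(2n+1)!} t^{2n+1}\right)
\end{align*}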
Moreover, since p(t) = -2t and q(t) = 1 are analytic everywhere, it follows
from theorem 8.3.1 that both $y_1$ and $y_2$ converge for all values of t, as must the
general solution (8.3.18).

Finally, if we desire to solve the initial-value problem with y(0) = 2 and
y'(0) = -1, we need only observe from our beginning assumption regarding
the series expansion of y that $y(0) = a_0 = 2$ and $y'(0) = a_1 = -1$. Therefore, the
solution to the IVP is

\[ y = 2\left(1 + \sum_{n=1}^{\infty} \frac{(-1) \cdot 3 \cdot 7 \cdots (4n-5)}{(2n)!} t^{2n}\right) - \left(t + \sum_{n=1}^{\infty} \frac{1 \cdot 5 \cdot 9 \cdots (4n-3)}{(2n+1)!} t^{2n+1}\right) \]
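To see the recurrence (8.3.15) and the closed-form products side by side, here is a small sketch using exact fractions. The function names are illustrative, and the initial values 2 and -1 are those of the IVP in this example.

```python
from fractions import Fraction
from math import factorial

def series_coeffs(a0, a1, count):
    """Coefficients generated by a_{n+2} = (2n - 1) a_n / ((n + 2)(n + 1))."""
    a = [Fraction(a0), Fraction(a1)]
    for n in range(count - 2):
        a.append(Fraction(2 * n - 1, (n + 2) * (n + 1)) * a[n])
    return a

def even_closed_form(n):
    """(-1) * 3 * 7 * ... * (4n - 5) / (2n)!  for n >= 1."""
    product = 1
    for k in range(1, n + 1):
        product *= 4 * k - 5
    return Fraction(product, factorial(2 * n))

def odd_closed_form(n):
    """1 * 5 * 9 * ... * (4n - 3) / (2n + 1)!  for n >= 1."""
    product = 1
    for k in range(1, n + 1):
        product *= 4 * k - 3
    return Fraction(product, factorial(2 * n + 1))

a = series_coeffs(2, -1, 10)          # y(0) = 2, y'(0) = -1
print([str(c) for c in a[:6]])
```

The even-indexed entries equal `2 * even_closed_form(n)` and the odd-indexed entries equal `-odd_closed_form(n)`, matching the two sums in the solution of the IVP.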


In the recurrence relation that arises from assuming that $y = a_0 + a_1 t +
a_2 t^2 + \cdots$, it is not always obvious that two linearly independent solutions to the original linear
second-order equation arise. Often, we must content ourselves with finding 
the first several terms of the overall general solution and rely on theorem 8.3.1 
to tell us that both have been found. We close this section with an example 
that demonstrates this fact through connections to earlier material we have 
studied. 


Example 8.3.4 Use infinite series to determine the solution to the initial-value
problem

\[ y'' - 2y' - 3y = 0, \quad y(0) = 4, \quad y'(0) = 0 \tag{8.3.18} \]

Compare your result to the known solution to this IVP which can be found
without using series.

Solution. Considering the series expansions for y, y', and y'', we observe that

\begin{align*}
y &= a_0 + a_1 t + a_2 t^2 + a_3 t^3 + \cdots + a_n t^n + \cdots \\
y' &= a_1 + 2a_2 t + 3a_3 t^2 + 4a_4 t^3 + \cdots + (n+1)a_{n+1} t^n + \cdots \\
y'' &= 2a_2 + 6a_3 t + 12a_4 t^2 + 20a_5 t^3 + \cdots + (n+2)(n+1)a_{n+2} t^n + \cdots
\end{align*}

From the differential equation y'' - 2y' - 3y = 0, we know that y'' = 2y' + 3y.
Equating like coefficients from the expressions for y'' and 2y' + 3y, we find the
recurrence relation

\begin{align*}
2a_2 &= 2a_1 + 3a_0 \\
6a_3 &= 4a_2 + 3a_1 \\
12a_4 &= 6a_3 + 3a_2 \\
20a_5 &= 8a_4 + 3a_3
\end{align*}
More generally, we can state that for any $n \ge 2$,

\[ a_n = \frac{(2n-2)a_{n-1} + 3a_{n-2}}{n(n-1)} \]

Using the given initial conditions, we find that $a_0 = y(0) = 4$ and $a_1 = y'(0) = 0$,
and subsequently that

\begin{align*}
a_2 &= \frac{2a_1 + 3a_0}{2} = \frac{0 + 12}{2} = 6 \\
a_3 &= \frac{4a_2 + 3a_1}{6} = \frac{24 + 0}{6} = 4 \\
a_4 &= \frac{6a_3 + 3a_2}{12} = \frac{24 + 18}{12} = \frac{7}{2}
\end{align*}

and therefore the solution to the IVP is

\[ y = 4 + 6t^2 + 4t^3 + \frac{7}{2}t^4 + \cdots \]

We can confirm that this is in fact the correct solution by solving the IVP
through another approach and considering power series expansions of the basic
solution functions. In particular, since the characteristic equation of (8.3.18)
is $r^2 - 2r - 3 = 0$ with roots r = 3 and r = -1, the general solution of
the DE is

\[ y = c_1 e^{3t} + c_2 e^{-t} \]

It is a standard exercise to show that the values of the constants that satisfy the
initial conditions are $c_1 = 1$ and $c_2 = 3$, so that

\[ y = e^{3t} + 3e^{-t} \]

If we now employ the standard power series expansion for $e^t$ to write series
expansions for the two solutions present in y, and then combine like terms, we
observe that

\begin{align*}
y &= e^{3t} + 3e^{-t} \\
&= \left(1 + 3t + \frac{9t^2}{2!} + \frac{27t^3}{3!} + \frac{81t^4}{4!} + \cdots\right) + \left(3 - 3t + \frac{3t^2}{2!} - \frac{3t^3}{3!} + \frac{3t^4}{4!} - \cdots\right) \\
&= 4 + \frac{12t^2}{2!} + \frac{24t^3}{3!} + \frac{84t^4}{4!} + \cdots \\
&= 4 + 6t^2 + 4t^3 + \frac{7}{2}t^4 + \cdots
\end{align*}

which is precisely the power series expansion of the solution we found at the
outset.
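The comparison in example 8.3.4 can be pushed further than four terms. The sketch below, whose function names are assumptions made for illustration, generates coefficients from the three-term recurrence and checks them against the coefficients $(3^n + 3(-1)^n)/n!$ of $e^{3t} + 3e^{-t}$.

```python
from fractions import Fraction
from math import factorial

def recurrence_coeffs(a0, a1, count):
    """Coefficients from n(n-1) a_n = (2n-2) a_{n-1} + 3 a_{n-2}, n >= 2."""
    a = [Fraction(a0), Fraction(a1)]
    for n in range(2, count):
        a.append(((2 * n - 2) * a[n - 1] + 3 * a[n - 2]) / (n * (n - 1)))
    return a

def closed_form_coeffs(count):
    """Coefficient of t^n in e^{3t} + 3 e^{-t}."""
    return [Fraction(3**n + 3 * (-1) ** n, factorial(n)) for n in range(count)]

series = recurrence_coeffs(4, 0, 10)   # y(0) = 4, y'(0) = 0
exact = closed_form_coeffs(10)
print([str(c) for c in series[:5]])
```

Unlike the earlier examples, this recurrence links each coefficient to the two before it, so the even and odd coefficients do not decouple.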
Example 8.3.4 demonstrates that although the series form of the solution can 
hide some of the inherent structure in the solution, this approach is nonetheless 
straightforward to apply and will effectively lead us to the power series expansion 
of the solution to a stated IVP. 


Exercises 8.3 
In exercises 1—13, find the first four terms in the Taylor series representation of 
the general solution to the stated DE. 


— 


.¥' +t =0 

Jy" +4y'’ =0 

Jy" +4y =0 

Ly’ +ty=0 

y+ 6y'+5y=0 
y'+y' +4y=0 
y’—y'—6y=0 
y"+ty=0 
(l-t)y"+y=0 
10. (t? —1)y’ —4y =0 
11. y+ 3ty’+3y =0 
12. (t7 +1)y’ —2y =0 
13. (1 — t?)y’ — 12ty’ — 18y =0 


oe ND WwW Fw bd 


In exercises 14-17, find the first four nonzero coefficients of the Taylor series 
expansion for the solution to the stated IVP. 


14, (4—2?)y"+2y=0, y0)=0, y'(0)=1 
15. y’+(1—-t)hy=0, y(0)=1, y(0)=0 
16. y’—t?y’+ysint=0, y(0)=0, y'(0)=1 
17.y'+ysint=0, y(0)=1, y'(0)=0 


8.4 Legendre’s equation 
A differential equation that arises naturally in physics, particularly when using
spherical coordinates, is the Legendre equation,

\[ (1 - t^2)y'' - 2ty' + \lambda(\lambda + 1)y = 0 \tag{8.4.1} \]

The parameter $\lambda$ is often a positive integer, though it is allowed to be any real,
non-negative constant. If we divide both sides of (8.4.1) by $1 - t^2$ to write the
equation in standard form y'' + p(t)y' + q(t)y = 0, we have

\[ y'' - \frac{2t}{1 - t^2}y' + \frac{\lambda(\lambda + 1)}{1 - t^2}y = 0 \tag{8.4.2} \]

With

\[ p(t) = -\frac{2t}{1 - t^2} \quad \text{and} \quad q(t) = \frac{\lambda(\lambda + 1)}{1 - t^2} \]

it follows that the origin is an ordinary point of Legendre's equation and the
nearest singularities lie at $t = \pm 1$. We therefore expect that we can find Taylor
series expansions about t = 0 for each of the two linearly independent solutions
of (8.4.1), and the radius of convergence of each such series will be at least 1.

To solve the Legendre equation, we assume that

\[ y(t) = \sum_{n=0}^{\infty} a_n t^n \]

and consider the three terms present in the DE: $(1 - t^2)y''$, $-2ty'$, and $\lambda(\lambda + 1)y$.
Letting $\alpha = \lambda(\lambda + 1)$ and writing each of these expressions in their series
expansion, we have

\begin{align*}
(1 - t^2)y'' &= (1 - t^2)\sum_{n=2}^{\infty} n(n-1)a_n t^{n-2} = \sum_{n=2}^{\infty} n(n-1)a_n t^{n-2} - \sum_{n=2}^{\infty} n(n-1)a_n t^n \\
&= \sum_{n=0}^{\infty} (n+2)(n+1)a_{n+2} t^n - \sum_{n=0}^{\infty} n(n-1)a_n t^n \tag{8.4.3} \\
-2ty' &= -2t\sum_{n=1}^{\infty} n a_n t^{n-1} = \sum_{n=1}^{\infty} -2n a_n t^n = \sum_{n=0}^{\infty} -2n a_n t^n \tag{8.4.4} \\
\alpha y &= \sum_{n=0}^{\infty} \alpha a_n t^n \tag{8.4.5}
\end{align*}


To achieve the final expression for $(1 - t^2)y''$ in (8.4.3), we re-indexed the first
sum by letting n be replaced by n + 2 and lowering the index, and re-indexed
the second sum by noting that when n = 0 and n = 1, the coefficient n(n - 1)
vanishes, so starting at n = 0 is the same as starting at n = 2. Likewise, for the
expression for $-2ty'$, the term $n a_n t^n$ is zero when n = 0, so we can start the sum
at n = 0 instead of n = 1 in (8.4.4). Thus, all three series are written in terms of
powers $t^n$ starting at n = 0.

Next, to satisfy Legendre's equation (8.4.1), we take the series expressions
in (8.4.3), (8.4.4), and (8.4.5) and set their collective sum to zero. Doing so,

\begin{align*}
0 &= (1 - t^2)y'' - 2ty' + \alpha y \\
&= \sum_{n=0}^{\infty} (n+2)(n+1)a_{n+2} t^n - \sum_{n=0}^{\infty} n(n-1)a_n t^n + \sum_{n=0}^{\infty} -2n a_n t^n + \sum_{n=0}^{\infty} \alpha a_n t^n \\
&= \sum_{n=0}^{\infty} \left[(n+2)(n+1)a_{n+2} - (n(n-1) + 2n - \alpha)a_n\right] t^n \\
&= \sum_{n=0}^{\infty} \left[(n+2)(n+1)a_{n+2} - (n^2 + n - \alpha)a_n\right] t^n \tag{8.4.6}
\end{align*}
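We thus observe (8.4.6) implies the recurrence relation

\[ (n+2)(n+1)a_{n+2} - (n^2 + n - \alpha)a_n = 0 \tag{8.4.7} \]

Recalling that $\alpha = \lambda(\lambda + 1) = \lambda^2 + \lambda$, we may write

\[ n^2 + n - \alpha = n^2 + n - \lambda^2 - \lambda = (n - \lambda)(n + \lambda + 1) \tag{8.4.8} \]

Hence, (8.4.7) and (8.4.8) together show

\[ a_{n+2} = \frac{(n - \lambda)(n + \lambda + 1)}{(n+2)(n+1)} a_n \tag{8.4.9} \]

As we have seen in certain other DEs, the recurrence relation (8.4.9) makes all
of the even coefficients in the expansion for y depend on $a_0$, and all of the odd
coefficients depend on $a_1$. Assuming that $a_0 = 1$ and computing the first few
even coefficients, we find that

\[ a_2 = -\frac{\lambda(\lambda + 1)}{2}, \quad a_4 = \frac{(2 - \lambda)(3 + \lambda)}{4 \cdot 3}a_2 = \frac{\lambda(\lambda + 1)(\lambda - 2)(\lambda + 3)}{4!} \]

so that one solution to the Legendre equation is

\[ y_1(t) = 1 - \frac{1}{2!}\lambda(\lambda + 1)t^2 + \frac{1}{4!}\lambda(\lambda + 1)(\lambda - 2)(\lambda + 3)t^4 + \cdots \tag{8.4.10} \]

Similar computations for the odd coefficients with $a_1 = 1$ result in the function

\[ y_2(t) = t - \frac{1}{3!}(\lambda - 1)(\lambda + 2)t^3 + \frac{1}{5!}(\lambda - 1)(\lambda - 3)(\lambda + 2)(\lambda + 4)t^5 + \cdots \tag{8.4.11} \]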


The solutions $y_1$ and $y_2$ are clearly linearly independent and therefore form a
basis for the set of all solutions to the Legendre equation. Note particularly that
each depends directly on the parameter $\lambda$, as the Legendre equation is actually a
family of equations where each equation depends on $\lambda$. In our development of $y_1$
and $y_2$, note that we assumed $a_0 = 1$ and $a_1 = 1$, which is equivalent to assuming
that y(0) = 1 and y'(0) = 1. The general solution of the Legendre equation is
$y = a_0 y_1 + a_1 y_2$, where $y_1$ and $y_2$ are given by (8.4.10) and (8.4.11), respectively.


The case when $\lambda$ is a non-negative integer is particularly interesting. From the
recurrence relation (8.4.9), whenever $\lambda = n$, it follows that $a_{n+2} = 0$ and hence
$a_{n+4}, a_{n+6}, \ldots$ are all zero. Since this causes the series expansion of $y_1$ or $y_2$
to terminate, one of the resulting solutions to the differential equation is a
polynomial. In particular, if $\lambda$ is an even integer, say $\lambda = 2m$, then $y_1(t)$ is a
polynomial of degree 2m. For example,

\begin{align*}
\lambda = 0: &\quad y_1(t) = 1 \\
\lambda = 2: &\quad y_1(t) = 1 - 3t^2 \\
\lambda = 4: &\quad y_1(t) = 1 - 10t^2 + \frac{35}{3}t^4
\end{align*}

Similarly, in the case where $\lambda = 2m + 1$ is an odd integer, $y_2(t)$ is a polynomial
of degree 2m + 1. The first few examples for small values of $\lambda$ are

\begin{align*}
\lambda = 1: &\quad y_2(t) = t \\
\lambda = 3: &\quad y_2(t) = t - \frac{5}{3}t^3 \\
\lambda = 5: &\quad y_2(t) = t - \frac{14}{3}t^3 + \frac{21}{5}t^5
\end{align*}
These polynomials demonstrate that when $\lambda$ is a non-negative integer, at least one
basic solution of the Legendre equation is a polynomial function. Moreover,
since the Legendre equation is linear, any scalar multiple of a solution is also a
solution, so we can scale these polynomials however we like. Doing so to make
the polynomial's value 1 when t = 1 results in the family of polynomials

\begin{align*}
P_0(t) &= 1 \\
P_1(t) &= t \\
P_2(t) &= \frac{1}{2}(3t^2 - 1) \\
P_3(t) &= \frac{1}{2}(5t^3 - 3t) \\
P_4(t) &= \frac{1}{8}(35t^4 - 30t^2 + 3) \\
P_5(t) &= \frac{1}{8}(63t^5 - 70t^3 + 15t)
\end{align*}
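The construction just described (run the recurrence (8.4.9) with $\lambda = n$ until it terminates, then rescale so the value at t = 1 is 1) can be sketched in a few lines. The function name `legendre_polynomial` and the choice to store coefficients lowest power first are assumptions made for this illustration.

```python
from fractions import Fraction

def legendre_polynomial(n):
    """Coefficients [c_0, ..., c_n] of P_n(t), lowest power first."""
    a = [Fraction(0)] * (n + 1)
    a[n % 2] = Fraction(1)                  # a_0 = 1 if n is even, a_1 = 1 if n is odd
    for k in range(n % 2, n - 1, 2):
        # Recurrence (8.4.9) with lambda = n
        a[k + 2] = Fraction((k - n) * (k + n + 1), (k + 2) * (k + 1)) * a[k]
    value_at_one = sum(a)                   # polynomial evaluated at t = 1
    return [c / value_at_one for c in a]

for n in range(6):
    print(n, [str(c) for c in legendre_polynomial(n)])
```

Only the coefficients of the same parity as n are ever nonzero, which mirrors the even/odd split of $y_1$ and $y_2$.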


The polynomials $P_n(t)$, which can also be described through a recurrence
relation linking $P_{n+2}$ to $P_{n+1}$ and $P_n$, are known as the Legendre polynomials
and form a well-known class of so-called orthogonal polynomials. The Legendre
polynomials have many interesting properties, including the fact that each
has n real, distinct roots that lie in the interval (-1, 1) and demonstrate an
oscillatory behavior similar to the graph of $P_{11}(t)$ shown in figure 8.1. The
study of orthogonal polynomials has important ramifications in many areas of 
mathematics and physics, but lies beyond the scope of this text. 

Regardless of whether $\lambda$ is a non-negative integer or not, the two infinite
series expansions for $y_1$ and $y_2$ in (8.4.10) and (8.4.11) are the two linearly
independent solutions of the Legendre equation. In the case where $\lambda$ is a
non-negative integer, we have shown that one of these two infinite series terminates
to form a polynomial, one of the Legendre polynomials. The other solution
turns out to have recognizable structure as well.


Figure 8.1 The degree 11 Legendre polynomial, $P_{11}(t)$.

For instance, when $\lambda = 0$, we know that one solution to the Legendre
equation comes from $y_1(t) = 1 = P_0(t)$. Setting $\lambda = 0$ in $y_2(t)$, it
follows

\[ y_2(t) = t + \frac{1}{3}t^3 + \frac{1}{5}t^5 + \cdots \tag{8.4.12} \]

Thus, when $\lambda = 0$, a second linearly independent solution is given by
$Q_0(t) = \frac{1}{2}\ln\left(\frac{1+t}{1-t}\right)$ and we write $y = c_1 P_0 + c_2 Q_0$. More generally, it can be shown that
for any non-negative integer $\lambda = n$, a related expression involving $Q_0$ exists
for the second linearly independent solution $Q_n$ that is not a polynomial. In
particular, these functions are known as Legendre functions of the second kind;
the first several of these functions are given by

\begin{align*}
Q_1(t) &= P_1(t)Q_0(t) - 1 \\
Q_2(t) &= P_2(t)Q_0(t) - \frac{3}{2}t \\
Q_3(t) &= P_3(t)Q_0(t) - \frac{5}{2}t^2 + \frac{2}{3} \\
Q_4(t) &= P_4(t)Q_0(t) - \frac{35}{8}t^3 + \frac{55}{24}t
\end{align*}


Note that the presence of $Q_0(t)$ in each solution highlights the fact that
singularities are present in the Legendre equation at $t = \pm 1$. The functions
$P_1(t), P_2(t), \ldots$ are the previously noted Legendre polynomials. Further, the
general solution of the Legendre equation with $\lambda = n \ge 0$ is therefore

\[ y(t) = c_1 P_n(t) + c_2 Q_n(t) \tag{8.4.13} \]

We close this section with an example.
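Example 8.4.1 Find the solution of the initial-value problem

\[ (1 - t^2)y'' - 2ty' + 12y = 0, \quad y(0) = 1, \quad y'(0) = 1 \]

Solution. First, observe that the given DE is Legendre's equation with $\lambda = 3$,
since 3(3 + 1) = 12. From our earlier work in this section, we know that the
general solution is

\begin{align*}
y(t) &= c_1 P_3(t) + c_2 Q_3(t) \\
&= c_1 P_3(t) + c_2\left(P_3(t)Q_0(t) - \frac{5}{2}t^2 + \frac{2}{3}\right) \\
&= P_3(t)\left(c_1 + c_2 Q_0(t)\right) + c_2\left(-\frac{5}{2}t^2 + \frac{2}{3}\right) \\
&= \left(\frac{5}{2}t^3 - \frac{3}{2}t\right)\left(c_1 + \frac{c_2}{2}\ln\left(\frac{1+t}{1-t}\right)\right) + c_2\left(-\frac{5}{2}t^2 + \frac{2}{3}\right) \tag{8.4.14}
\end{align*}

Applying the initial conditions y(0) = 1 and y'(0) = 1 to (8.4.14), we can
show that $c_1 = -2/3$ and $c_2 = 3/2$, and thus

\[ y(t) = \left(\frac{5}{2}t^3 - \frac{3}{2}t\right)\left(-\frac{2}{3} + \frac{3}{4}\ln\left(\frac{1+t}{1-t}\right)\right) + \frac{3}{2}\left(-\frac{5}{2}t^2 + \frac{2}{3}\right) \]

is the solution to the given IVP.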
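One way to gain confidence in the constants found in example 8.4.1 is to compare the closed form with the series produced by the recurrence (8.4.9) for $\lambda = 3$, $a_0 = 1$, $a_1 = 1$. The sketch below does this numerically at one point inside the interval of convergence. The function names, the sample point t = 0.4, and the number of terms are arbitrary choices for illustration.

```python
from math import log

def series_solution(t, lam=3, a0=1.0, a1=1.0, terms=200):
    """Partial sum of the power series built from recurrence (8.4.9)."""
    a = [a0, a1]
    for n in range(terms - 2):
        a.append((n - lam) * (n + lam + 1) / ((n + 2) * (n + 1)) * a[n])
    return sum(c * t**k for k, c in enumerate(a))

def closed_form(t):
    """Solution of example 8.4.1 written with P_3 and Q_0, valid for |t| < 1."""
    p3 = 2.5 * t**3 - 1.5 * t
    q0 = 0.5 * log((1 + t) / (1 - t))
    c1, c2 = -2 / 3, 3 / 2
    return p3 * (c1 + c2 * q0) + c2 * (-2.5 * t**2 + 2 / 3)

t = 0.4
difference = abs(series_solution(t) - closed_form(t))
print(series_solution(t), closed_form(t), difference)
```

Because the odd part of the series terminates when $\lambda = 3$, only the even coefficients contribute infinitely many terms here.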


Exercises 8.4 


1. Verify by direct substitution that the Legendre equation is satisfied by the
polynomials $P_2(t)$ and $P_3(t)$ when $\lambda = 2$ and $\lambda = 3$, respectively.


2. Verify by direct substitution that $Q_0(t) = \frac{1}{2}\ln\left(\frac{1+t}{1-t}\right)$ is a solution
of Legendre's equation with $\lambda = 0$.

3. Determine the Taylor series expansion about a = 0 of
$f(t) = \frac{1}{2}\ln\left(\frac{1+t}{1-t}\right)$ and confirm that this matches (8.4.12).


4. Determine expressions for $P_6(t)$ and $P_7(t)$.


In exercises 5-7, find the general solution of the stated differential equation in
terms of $P_n(t)$ and $Q_n(t)$. (Hint: Use the method of undetermined coefficients
in the standard way to find a particular solution of each equation.)


5. $(1 - t^2)y'' - 2ty' + 6y = 6$
6. $(1 - t^2)y'' - 2ty' + 20y = 36t$
7. $(1 - t^2)y'' - 2ty' + 30y = 12t^2$


In exercises 8-17, find the first four nonzero coefficients of the Taylor series 
expansion (about t = 0) for the solution to the stated IVP. 


8. $(1 - t^2)y'' - 2ty' + 2y = 0$, y(0) = 1, y'(0) = 0
9. $(1 - t^2)y'' - 2ty' + 3y = 0$, y(0) = 1, y'(0) = 0
10. $(1 - t^2)y'' - 2ty' + 20y = 18t$, y(0) = 0, y'(0) = 1
11. $9(1 - t^2)y'' - 18ty' + 4y = 0$, y(0) = 0, y'(0) = 1
12. $(1 - t^2)y'' - 2ty' + 20y = 0$, y(0) = 1, y'(0) = 1
13. $(1 - t^2)y'' - 2ty' + 20y = 14t^2$, y(0) = 3, y'(0) = 1


8.5 Three important examples 


In this penultimate section on series solutions to differential equations, we 
consider and discuss three examples that arise in applied physics. 


8.5.1 The Hermite equation 
The Hermite equation is the linear second-order differential equation given by

\[ y'' - 2ty' + 2qy = 0 \tag{8.5.1} \]

where q is a real constant. Using the Taylor series expansions for y, y', and y''
in the usual way with $y = a_0 + a_1 t + a_2 t^2 + \cdots$, it can be shown that

\[ \sum_{n=0}^{\infty} \left[(n+2)(n+1)a_{n+2} - 2(n - q)a_n\right] t^n = 0 \tag{8.5.2} \]

from which follows the recurrence relation

\[ a_{n+2} = \frac{2(n - q)}{(n+1)(n+2)} a_n \tag{8.5.3} \]

As we have seen in previous examples, the even-subscripted coefficients
depend on $y(0) = a_0$, and the odd-subscripted coefficients involve $y'(0) = a_1$.
To calculate the first few nonzero terms in the expansion for the solution $y_1(t)$
involving even powers of t, we observe that

\begin{align*}
a_2 &= \frac{2(0 - q)}{1 \cdot 2}a_0 = -\frac{2q}{2!}a_0 \\
a_4 &= \frac{2(2 - q)}{3 \cdot 4}a_2 = -2^2\frac{q(2 - q)}{4!}a_0 \\
a_6 &= \frac{2(4 - q)}{5 \cdot 6}a_4 = -2^3\frac{q(2 - q)(4 - q)}{6!}a_0
\end{align*}

More generally, it follows that

\[ a_{2k} = -2^k\frac{q(2 - q) \cdots (2k - 2 - q)}{(2k)!}a_0 \tag{8.5.4} \]

If we elect to use the initial conditions y(0) = 1 and y'(0) = 0, this implies that
$a_0 = 1$ and $a_1 = 0$; the latter condition and the recurrence relation (8.5.3) imply
that all odd-subscripted coefficients are zero, and hence one solution to the
Hermite differential equation is

\begin{align*}
y_1(t) &= a_0 + a_2 t^2 + a_4 t^4 + \cdots \\
&= 1 - \frac{2q}{2!}t^2 - \frac{2^2 q(2 - q)}{4!}t^4 - \cdots \\
&= 1 - \sum_{n=1}^{\infty} \frac{2^n q(2 - q) \cdots (2n - 2 - q)}{(2n)!} t^{2n} \tag{8.5.5}
\end{align*}

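Using similar reasoning with odd-subscripted coefficients, (8.5.3) implies

\begin{align*}
a_3 &= \frac{2(1 - q)}{2 \cdot 3}a_1 \\
a_5 &= \frac{2(3 - q)}{4 \cdot 5}a_3 = 2^2\frac{(1 - q)(3 - q)}{5!}a_1 \\
a_7 &= \frac{2(5 - q)}{6 \cdot 7}a_5 = 2^3\frac{(1 - q)(3 - q)(5 - q)}{7!}a_1
\end{align*}

From this, we can deduce that the general odd coefficient is given by

\[ a_{2k+1} = 2^k\frac{(1 - q)(3 - q) \cdots (2k - 1 - q)}{(2k+1)!}a_1 \tag{8.5.6} \]

Using the initial conditions $y(0) = 0 = a_0$ and $y'(0) = 1 = a_1$, a second solution
to the Hermite equation is

\[ y_2(t) = t + \sum_{n=1}^{\infty} \frac{2^n (1 - q)(3 - q) \cdots (2n - 1 - q)}{(2n+1)!} t^{2n+1} \tag{8.5.7} \]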
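Since $y_1(t)$ and $y_2(t)$ are linearly independent, the general solution to the
Hermite equation is

\begin{align*}
y &= a_0 y_1 + a_1 y_2 \\
&= a_0\left(1 - \sum_{n=1}^{\infty} \frac{2^n q(2 - q) \cdots (2n - 2 - q)}{(2n)!} t^{2n}\right) \\
&\quad + a_1\left(t + \sum_{n=1}^{\infty} \frac{2^n (1 - q)(3 - q) \cdots (2n - 1 - q)}{(2n+1)!} t^{2n+1}\right) \tag{8.5.8}
\end{align*}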

Just as we experienced with Legendre's equation, there are values for the constant
q in the Hermite equation that lead to polynomial solutions. In particular, the
presence of the factor (2n - 2 - q) in $y_1(t)$ implies that whenever q is an even,
non-negative integer, then $y_1(t)$ is a polynomial. Specifically, from (8.5.5), when
q = 0, q = 2, and q = 4, it follows that

\begin{align*}
q = 0: &\quad y_1(t) = 1 \\
q = 2: &\quad y_1(t) = 1 - 2t^2 \tag{8.5.9} \\
q = 4: &\quad y_1(t) = 1 - 4t^2 + \frac{4}{3}t^4
\end{align*}

Similarly, for q = 1, q = 3, and q = 5, the function $y_2(t)$ that is a solution to the
Hermite equation is found to be

\begin{align*}
q = 1: &\quad y_2(t) = t \\
q = 3: &\quad y_2(t) = t - \frac{2}{3}t^3 \tag{8.5.10} \\
q = 5: &\quad y_2(t) = t - \frac{4}{3}t^3 + \frac{4}{15}t^5
\end{align*}

The polynomial solutions to Hermite's equation given in (8.5.9) and (8.5.10)
are usually called the Hermite polynomials $H_n(t)$ when scaled such that the
coefficient of the highest power of t is $2^n$. The first four Hermite polynomials are

\begin{align*}
H_0(t) &= 1 \\
H_1(t) &= 2t \\
H_2(t) &= 4t^2 - 2 \\
H_3(t) &= 8t^3 - 12t
\end{align*}
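The scaling convention can be checked mechanically. In the sketch below (the function name `hermite_polynomial` is a hypothetical helper), the recurrence (8.5.3) is run with q = n until the series terminates, and the result is rescaled so that the leading coefficient equals $2^n$.

```python
from fractions import Fraction

def hermite_polynomial(n):
    """Coefficients [c_0, ..., c_n] of H_n(t), lowest power first."""
    a = [Fraction(0)] * (n + 1)
    a[n % 2] = Fraction(1)                  # start from a_0 = 1 (n even) or a_1 = 1 (n odd)
    for k in range(n % 2, n - 1, 2):
        # Recurrence (8.5.3) with q = n
        a[k + 2] = Fraction(2 * (k - n), (k + 1) * (k + 2)) * a[k]
    scale = Fraction(2**n) / a[n]           # force the coefficient of t^n to be 2^n
    return [c * scale for c in a]

for n in range(5):
    print(n, [str(c) for c in hermite_polynomial(n)])
```

For n = 4 this gives $16t^4 - 48t^2 + 12$, which is the rescaled version of the q = 4 entry in (8.5.9).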


The Hermite polynomials are another example of a family of orthogonal
polynomials; Hermite polynomials are orthogonal on $(-\infty, \infty)$ with respect
to the weighting function $w(t) = e^{-t^2}$. Like Legendre polynomials, they have
a wide range of interesting properties and the possibilities they present for
further study go well beyond the scope of this text. A plot of $H_{11}(t)$ is shown
in figure 8.2. The Hermite polynomials have large oscillations; the degree 11
polynomial has two more zeros, located at approximately $\pm 3.7$, which are not
shown in figure 8.2.


Figure 8.2 The degree 11 Hermite polynomial, $H_{11}(t)$, plotted on the
interval [-3, 3].


8.5.2 The Laguerre equation 


The Laguerre equation is given by

\[ ty'' + (1 - t)y' + qy = 0 \tag{8.5.11} \]

where q is, once again, a real constant. If we divide through by t, Laguerre's
equation is equivalently expressed as

\[ y'' + \frac{1 - t}{t}y' + \frac{q}{t}y = 0 \]

Since the coefficient functions p(t) of y' and q(t) of y are each undefined at
t = 0, the Laguerre equation has a singular point at the origin. Nonetheless,
it turns out that we can find a series expansion for a solution at the
origin.

Letting $y = a_0 + a_1 t + a_2 t^2 + \cdots$ and substituting for y, y', and y'' in (8.5.11),
it can be shown that the coefficients $a_n$ must satisfy

\[ \sum_{n=0}^{\infty} \left[(n+1)^2 a_{n+1} + (q - n)a_n\right] t^n = 0 \tag{8.5.12} \]

It follows from (8.5.12) that

\[ (n+1)^2 a_{n+1} + (q - n)a_n = 0 \]

and therefore

\[ a_{n+1} = -\frac{q - n}{(n+1)^2} a_n \tag{8.5.13} \]
Note that this recurrence relation relies only on the value of $a_0$, and therefore
only leads to one solution to the Laguerre equation.³ Applying (8.5.13), we see

\begin{align*}
a_1 &= -q a_0 \\
a_2 &= -\frac{q - 1}{2^2}a_1 = \frac{q(q - 1)}{2^2}a_0 \\
a_3 &= -\frac{q - 2}{3^2}a_2 = -\frac{q(q - 1)(q - 2)}{3^2 \cdot 2^2}a_0
\end{align*}

More generally,

\[ a_n = -\frac{q - n + 1}{n^2}a_{n-1} = (-1)^n\frac{q(q - 1) \cdots (q - n + 1)}{(n!)^2}a_0 \]

Taking $a_0 = 1$, we have found that one solution to the Laguerre equation is

\[ y_1(t) = 1 + \sum_{n=1}^{\infty} (-1)^n\frac{q(q - 1) \cdots (q - n + 1)}{(n!)^2} t^n \tag{8.5.14} \]

When q is a non-negative integer, we see from (8.5.14) that $y_1(t)$ is a
polynomial of degree q. Recalling the binomial coefficient $\binom{q}{n}$ given by

\[ \binom{q}{n} = \frac{q!}{n!(q - n)!} = \frac{q(q - 1) \cdots (q - n + 1)}{n!} \tag{8.5.15} \]

we are able to find a relatively simple expression for these polynomial solutions.
The Laguerre polynomial of degree q is given by

\[ L_q(t) = \sum_{n=0}^{q} \frac{(-1)^n}{n!}\binom{q}{n} t^n \tag{8.5.16} \]

and these functions turn out to be the only solutions (up to scalar multiples) of
the Laguerre equation that are analytic at t = 0. The Laguerre polynomials are
yet another family of orthogonal polynomials. The first few of these polynomials
are given below, followed by a graph of $L_{11}(t)$ in figure 8.3.

\begin{align*}
L_1(t) &= 1 - t \\
L_2(t) &= 1 - 2t + \frac{1}{2}t^2 \\
L_3(t) &= 1 - 3t + \frac{3}{2}t^2 - \frac{1}{6}t^3 \\
L_4(t) &= 1 - 4t + 3t^2 - \frac{2}{3}t^3 + \frac{1}{24}t^4
\end{align*}
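The agreement between the recurrence (8.5.13) and the binomial formula (8.5.16) is easy to confirm for small degrees. The following sketch uses two illustrative helper functions, both returning coefficient lists with the lowest power first.

```python
from fractions import Fraction
from math import comb, factorial

def laguerre_from_recurrence(q):
    """Coefficients of L_q(t) from a_{n+1} = -(q - n) a_n / (n + 1)^2 with a_0 = 1."""
    a = [Fraction(1)]
    for n in range(q):
        a.append(Fraction(-(q - n), (n + 1) ** 2) * a[n])
    return a

def laguerre_from_binomial(q):
    """Coefficients of L_q(t) from formula (8.5.16)."""
    return [Fraction((-1) ** n * comb(q, n), factorial(n)) for n in range(q + 1)]

for q in range(1, 5):
    print(q, [str(c) for c in laguerre_from_recurrence(q)])
```

The recurrence stops by itself after q steps because the factor (q - n) vanishes at n = q, which is why the loop only needs to run q times.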


3 A second solution can be found by more sophisticated techniques that lie beyond the scope of this 
book. 
Figure 8.3 The degree 11 Laguerre polynomial $L_{11}(t)$ plotted on the interval
[0, 10].


8.5.3 The Bessel equation 
The Bessel equation

\[ t^2 y'' + ty' + (t^2 - \lambda^2)y = 0 \tag{8.5.17} \]
is a very important DE in mathematical physics. The properties of its 
solutions have been well studied; the equation often appears in the process of 
solving certain partial differential equations that appear when using cylindrical 
coordinates. 

The parameter A in (8.5.17) is a real constant. Like the Laguerre equation, 
the Bessel equation has a singular point at t = 0, so we cannot expect to find 
solutions to the equation with Taylor series centered at a = 0. Nonetheless, as 
we will show shortly, a solution analytic at t = 0 exists when A is a non-negative 
integer. While a second linearly independent solution to the Bessel equation can 
be found, the techniques required are beyond the scope of this text. 

Here we only explore the series solutions that do exist for the Bessel
equation. Let $\lambda = m$ be a non-negative integer and assume that
$y_1(t) = a_0 + a_1 t + a_2 t^2 + \cdots$. Substituting directly in (8.5.17) leads to

$$-m^2 a_0 + (1 - m^2)\, a_1 t + \sum_{k=2}^{\infty} \left[(k^2 - m^2)\, a_k + a_{k-2}\right] t^k = 0 \tag{8.5.18}$$

Since each coefficient of powers of $t$ in (8.5.18) must be zero, it follows that
$m^2 a_0 = 0$, $(1 - m^2) a_1 = 0$, and

$$(k^2 - m^2)\, a_k + a_{k-2} = 0, \quad k \ge 2 \tag{8.5.19}$$


If $k < m$, then it follows $a_k = 0$ for each such $k$ by the three preceding
equalities. When $k = m$, the coefficient $k^2 - m^2$ of $a_k$ vanishes and thus (8.5.19)
becomes an identity, rendering the value of $a_m$ arbitrary. Note further that
$a_{m+1} = a_{m+3} = \cdots = 0$ is another consequence of (8.5.19). Thus, $a_m$ can be any
constant, and subsequent terms must satisfy the recurrence

$$a_{k+2} = -\frac{1}{(k+2)^2 - m^2}\, a_k = -\frac{1}{(k+2-m)(k+2+m)}\, a_k, \quad k = m, m+2, m+4, \ldots \tag{8.5.20}$$

Hence, given a non-negative integer $\lambda = m$ and a value for $a_m$, we can determine
all of the coefficients of the Taylor expansion of an analytic solution to the
Bessel equation. In particular, these coefficients $a_{m+2j}$ for $j \ge 0$ must satisfy
the recurrence relation (8.5.20), from which using $a_m = 1$ we find the closed
formula

$$a_{m+2j} = \frac{(-1)^j}{2^{2j}\, j!\, (m+1)(m+2)\cdots(m+j)} \tag{8.5.21}$$

Hence, one solution of Bessel's equation (again, when $\lambda = m$ is a non-negative
integer) is

$$y_1(t) = \sum_{j=0}^{\infty} \frac{(-1)^j}{2^{2j}\, j!\, (m+1)(m+2)\cdots(m+j)}\, t^{m+2j} \tag{8.5.22}$$

where the $j = 0$ term is simply $t^m$.


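The Bessel function of the first kind of order $n$ (it is standard to use $n$ rather than
$m$ for the order of the Bessel function) is the scalar multiple of $y_1(t)$ given by

$$J_n(t) = \frac{1}{2^n n!}\, y_1(t) = \sum_{j=0}^{\infty} \frac{(-1)^j}{j!\,(n+j)!} \left(\frac{t}{2}\right)^{n+2j} \tag{8.5.23}$$

For example, the first two Bessel functions are

$$J_0(t) = \sum_{j=0}^{\infty} \frac{(-1)^j}{(j!)^2} \left(\frac{t}{2}\right)^{2j} \tag{8.5.24}$$

and

$$J_1(t) = \sum_{j=0}^{\infty} \frac{(-1)^j}{j!\,(j+1)!} \left(\frac{t}{2}\right)^{2j+1} \tag{8.5.25}$$
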


The graph of $J_0(t)$ in figure 8.4 shows that the Bessel function exhibits damped
oscillation.
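The series for the Bessel function of the first kind converges quickly enough that a partial sum is a practical way to evaluate it for moderate $t$. The Python sketch below sums the first several terms of $J_n(t) = \sum_j \frac{(-1)^j}{j!(n+j)!}(t/2)^{n+2j}$ and then estimates the left side of the Bessel equation with finite differences as a sanity check. The function names, the number of terms, and the step size `h` are our own assumptions chosen for illustration.

```python
from math import factorial

def bessel_j(n, t, terms=40):
    """Partial sum of J_n(t) = sum_j (-1)^j / (j! (n+j)!) * (t/2)^(n+2j)."""
    return sum((-1) ** j / (factorial(j) * factorial(n + j)) * (t / 2) ** (n + 2 * j)
               for j in range(terms))

def bessel_residual(n, t, h=1e-3):
    """Finite-difference estimate of t^2 y'' + t y' + (t^2 - n^2) y for y = J_n."""
    y0, yp, ym = bessel_j(n, t), bessel_j(n, t + h), bessel_j(n, t - h)
    d1 = (yp - ym) / (2 * h)            # central difference for y'
    d2 = (yp - 2 * y0 + ym) / h ** 2    # central difference for y''
    return t * t * d2 + t * d1 + (t * t - n * n) * y0

print(bessel_j(0, 1.0), bessel_j(1, 1.0))
print(bessel_residual(0, 3.0), bessel_residual(1, 3.0))
```

The residuals are not exactly zero because of the finite-difference approximation, but they should be small compared with the size of the individual terms.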


In this section, through the Hermite, Laguerre, and Bessel equations, we have 
encountered examples not only of three important DEs, but also of the various 
types of important functions that arise as solutions to these equations. Hermite 
polynomials, Laguerre polynomials, and Bessel functions are often studied 
in courses on special functions and demonstrate a wide range of interesting 
properties that mathematicians, engineers, and physicists have studied. 


Exercises 8.5 


1. Determine the degree 4 and 5 Hermite polynomials, $H_4(t)$ and $H_5(t)$.

Figure 8.4 The Bessel function of the first kind, $J_0(t)$.

In exercises 2-4, find the first three nonzero terms in the Taylor series
representation of the general solution to the given Hermite equation.

2. $y'' - 2ty' + 6y = 0$
3. $y'' - 2ty' + 10y = 0$
4. $y'' - 2ty' + 4y = 0$

In exercises 5-7, find the first three nonzero terms in the Taylor series
representation of the solution to the given IVP.

5. $y'' - 2ty' + 6y = 0$, $y(0) = 2$, $y'(0) = 10$
6. $y'' - 2ty' + 10y = 0$, $y(0) = 1$, $y'(0) = 0$
7. $y'' - 2ty' + 4y = 8t$, $y(0) = 1$, $y'(0) = 0$
8. Determine the degree 5 and 6 Laguerre polynomials, $L_5(t)$ and $L_6(t)$.

Given that a general solution of Laguerre's equation is $c_1 L_q(t) + c_2 u_2(t)$, where
$u_2(t)$ is singular at the origin, in exercises 9-11, determine the solution to the
given IVP.

9. $ty'' + (1-t)y' + 3y = 0$, $y(0) = \text{finite}$, $y(1) = 1$
10. $ty'' + (1-t)y' + 4y = 0$, $y(0) = \text{finite}$, $y(2) = 2$
11. $ty'' + (1-t)y' + 4y = 3t$, $y(0) = \text{finite}$, $y(1) = 4$

12. Determine the first five nonzero terms in the series expansion of
$J_0(t)$ about $t = 0$. In addition, state the form of $J_2(t)$ in sigma
notation.


It can be shown that a second linearly independent solution to the Bessel
equation when $\lambda = n$ (called the Bessel function of the second kind of
order $n$) is given by

$$Y_n(t) = \frac{2}{\pi}\, J_n(t) \left(\ln\frac{t}{2} + \gamma\right) + R(t) + u(t)$$

where $R(t)$ is a rational function, $\gamma \approx 0.577215665$ is Euler's constant, and $u(t)$
is a power series convergent for all $t$. Note that $Y_n(t)$ is singular at the origin. In
exercises 13-15, determine the general solution to the given equation.

13. $t^2 y'' + ty' + (t^2 - 4)y = 0$
14. $t^2 y'' + ty' + (t^2 - 9)y = 0$
15. $t^2 y'' + ty' + (t^2 - 16)y = 0$

In exercises 16-18, determine the solution to the given IVP.

16. $t^2 y'' + ty' + (t^2 - 4)y = 0$, $y(0) = \text{finite}$, $y(1) = 1$
17. $t^2 y'' + ty' + (t^2 - 9)y = 0$, $y(0) = \text{finite}$, $y(1) = -3$
18. $t^2 y'' + ty' + (t^2 - 16)y = 0$, $y(0) = \text{finite}$, $y(1) = 2$


8.6 The Method of Frobenius 


Some second-order linear DEs that appear in physical applications do not have 
two linearly independent analytic solutions about t = 0. Perhaps the most 
important and well-studied example is the Bessel equation (8.5.17). A somewhat 
simpler example is

$$t^2 y'' + \frac{3}{2}\, t y' - \frac{1}{2}\, y = 0 \tag{8.6.1}$$

which is a Cauchy-Euler equation (on which more information can be found
in section 4.7.3). It is a straightforward exercise to show that for all $t > 0$,
$y_1(t) = t^{-1}$ and $y_2(t) = \sqrt{t}$ are linearly independent solutions of (8.6.1). Note
that neither $y_1$ nor $y_2$ has a derivative at the origin, and therefore neither is
analytic at $t = 0$; thus, each lacks a Taylor series expansion at the origin.

F. Georg Frobenius (1847-1917) showed that a certain class of linear 
second-order DEs with a singular point at the origin can be represented in 
series form by a slight generalization of a Taylor series. In particular, he showed 
that these series solutions have the form 


$$y = t^r \sum_{k=0}^{\infty} b_k t^k = \sum_{k=0}^{\infty} b_k t^{k+r} \tag{8.6.2}$$

where $r$ is a real number and $\sum_{k=0}^{\infty} b_k t^k$ converges in some open interval
containing the origin. The series (8.6.2) is called a Frobenius series, and the
following method we will discuss for obtaining $r$ and the coefficients $b_k$ is
known as the Method of Frobenius.

The Cauchy-Euler equation and the Bessel equation both belong to this
class of equations that can be solved by the Method of Frobenius. In what
follows, we focus particularly on equations of the form

$$t^2 y'' + t\,p(t)\,y' + q(t)\,y = 0 \tag{8.6.3}$$


where p(t) and q(t) are low-degree polynomials. Note that p and q are analytic 
at the origin, and therefore each has a convergent Taylor series there. Any linear 
second-order DE with this property is said to have a regular singular point at 
the origin. The Method of Frobenius applies to all such equations. Finally, 
observe that if p(t) and q(t) are constant polynomials, then (8.6.3) reduces to a 
Cauchy—Euler equation. 

To begin, we suppose that there is a solution of (8.6.3) that has a series 
expansion of the form 


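$$y = \sum_{k=0}^{\infty} b_k t^{k+r} \tag{8.6.4}$$

where $b_0 \ne 0$ and $\sum_{k=0}^{\infty} b_k t^k$ converges in $0 < |t| < R$. From this, it follows that

$$y' = \sum_{k=0}^{\infty} (k+r)\, b_k t^{k+r-1} \tag{8.6.5}$$

and

$$y'' = \sum_{k=0}^{\infty} (k+r)(k+r-1)\, b_k t^{k+r-2} \tag{8.6.6}$$
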
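Furthermore, we suppose that $p(t)$ and $q(t)$ have the expansions

$$p(t) = p_0 + p_1 t + p_2 t^2 + \cdots + p_k t^k + \cdots$$

$$q(t) = q_0 + q_1 t + q_2 t^2 + \cdots + q_k t^k + \cdots$$

Substituting these expressions for $y$, $y'$, $y''$, $p$, and $q$ in (8.6.3) and gathering like
terms, we find that

$$\begin{aligned}
0 &= t^2 y'' + t\,p(t)\,y' + q(t)\,y \\
  &= \sum_{k=0}^{\infty} (k+r)(k+r-1)\, b_k t^{k+r} + (p_0 + p_1 t + p_2 t^2 + \cdots + p_k t^k + \cdots) \sum_{k=0}^{\infty} (k+r)\, b_k t^{k+r} \\
  &\qquad + (q_0 + q_1 t + q_2 t^2 + \cdots + q_k t^k + \cdots) \sum_{k=0}^{\infty} b_k t^{k+r} \\
  &= \left(r(r-1) + p_0 r + q_0\right) b_0 t^r + c_1 t^{r+1} + c_2 t^{r+2} + \cdots
\end{aligned} \tag{8.6.7}$$
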


where the general term $c_n$ depends on $n$ and all earlier coefficients for each $n \ge 1$.
A general formula for $c_n$ turns out to be complicated and not particularly useful
for the examples we wish to study, so we choose not to derive such a formula.

The most important conclusion to draw from (8.6.7) comes from the fact
that each coefficient of the general power series expansion must be zero, so that
since $b_0 \ne 0$,

$$r(r-1) + p_0 r + q_0 = 0 \tag{8.6.8}$$

Equation (8.6.8) is called the indicial equation for the Method of Frobenius.
Note that this equation is quadratic in $r$; its two roots are the values of $r$ that
are used in (8.6.2). At this point, it is useful for us to turn our attention to two
specific examples of the Method of Frobenius at work.
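Because the indicial equation is just the quadratic $r^2 + (p_0 - 1)r + q_0 = 0$, its roots can be computed mechanically from $p_0$ and $q_0$. The short Python sketch below does this with the quadratic formula; the function name `indicial_roots` is hypothetical, and complex arithmetic is used only so that the same code also handles a negative discriminant.

```python
import cmath

def indicial_roots(p0, q0):
    """Roots of r(r - 1) + p0*r + q0 = 0, rewritten as r^2 + (p0 - 1) r + q0 = 0."""
    b = p0 - 1
    disc = cmath.sqrt(b * b - 4 * q0)
    return ((-b + disc) / 2, (-b - disc) / 2)

# Cauchy-Euler equation (8.6.1): p0 = 3/2, q0 = -1/2
print(indicial_roots(1.5, -0.5))

# Bessel equation with lambda = 1/2: p0 = 1, q0 = -lambda^2
print(indicial_roots(1.0, -0.25))
```

For the Cauchy-Euler equation (8.6.1) the roots are $r = 1/2$ and $r = -1$, matching the exponents of the solutions $\sqrt{t}$ and $t^{-1}$ noted earlier; for the Bessel equation they are $r = \pm\lambda$.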


Example 8.6.1 Find a Frobenius series solution for the Bessel-Clifford
equation

$$t^2 y'' + (1-a)\,t y' + t y = 0 \tag{8.6.9}$$

where $a$ is a constant.


Solution. With $a$ being a constant, we have $p(t) = 1 - a$, so in the series
expansion for $p$, $p_0 = 1 - a$. Moreover, $q(t) = t$, so $q_0 = 0$. Thus, for the given
DE the indicial equation is

$$r(r-1) + (1-a)r = 0$$

Rearranging, we see that $r(r - 1 + 1 - a) = r(r - a) = 0$, and thus the roots of
the indicial equation are $r = 0$ and $r = a$.

In the case that $r = 0$, the Method of Frobenius is providing an analytic
solution to (8.6.9) of the form

$$y_1(t) = \sum_{k=0}^{\infty} b_k t^k$$

Dividing both sides of (8.6.9) by $t$ and substituting this expression for $y$ using
the standard series methods we have already discussed, it follows that

$$\sum_{k=0}^{\infty} \left[(k+1)(k+1-a)\, b_{k+1} + b_k\right] t^k = 0$$

from which we obtain the recurrence relation

$$b_{k+1} = \frac{-1}{(k+1)(k+1-a)}\, b_k \tag{8.6.10}$$

It follows from (8.6.10) that the closed form expression for $b_k$ is

$$b_k = \frac{(-1)^k}{k!\,(1-a)(2-a)\cdots(k-a)}\, b_0, \quad k \ge 1$$

so we find that

$$y_1(t) = b_0 \left(1 + \sum_{k=1}^{\infty} \frac{(-1)^k}{k!\,(1-a)(2-a)\cdots(k-a)}\, t^k\right) \tag{8.6.11}$$

which is valid for all $t$ provided that $a \ne 1, 2, \ldots$. Note that from this recurrence
relation, every $b_k$ is a function of $b_0$, and thus there cannot be two linearly
independent solutions to the Bessel-Clifford equation that are analytic at 0.
Indeed, every solution linearly independent of $y_1(t)$ must be singular at 0. And
while the equation has a singular point at the origin, there is an analytic solution
there for every $a$ except when $a$ is a positive integer. We now turn to the other
root of the indicial equation in search of a second solution to the Bessel-Clifford
equation.

Using $r = a$, we have

$$t\,y(t) = \sum_{k=0}^{\infty} b_k t^{k+a+1}$$

$$(1-a)\,t\,y'(t) = \sum_{k=0}^{\infty} (1-a)(k+a)\, b_k t^{k+a}$$

$$t^2 y''(t) = \sum_{k=0}^{\infty} (k+a)(k+a-1)\, b_k t^{k+a}$$

Adding these equations forms the left side of the differential equation we aspire
to solve; doing so and simplifying, we find that

$$0 = t^2 y''(t) + (1-a)\,t\,y'(t) + t\,y(t) = \sum_{k=0}^{\infty} k(k+a)\, b_k t^{k+a} + \sum_{k=0}^{\infty} b_k t^{k+a+1}$$

Since the first term in the first sum is zero, if we adjust the index of the summation
in the second sum and combine, we have

$$\sum_{k=1}^{\infty} \left[k(k+a)\, b_k + b_{k-1}\right] t^{k+a} = 0$$

from which it follows that

$$k(k+a)\, b_k + b_{k-1} = 0, \quad k \ge 1$$


This standard recurrence relation can be solved to write every $b_k$ in terms of $b_0$.
Indeed, we see

$$b_k = \frac{(-1)^k}{k!\,(1+a)(2+a)\cdots(k+a)}\, b_0, \quad k \ge 1$$

so that the Frobenius series representation of the solution is

$$y_2(t) = b_0\, t^a \left(1 + \sum_{k=1}^{\infty} \frac{(-1)^k}{k!\,(1+a)(2+a)\cdots(k+a)}\, t^k\right) \tag{8.6.12}$$

We close this example with a few important observations. First, if $a = 0$, then the
Frobenius solution $y_2(t)$ is identical to the earlier obtained $y_1(t)$. Moreover, if $a$
is a non-negative integer, then the Method of Frobenius produces a Taylor series
expansion that is analytic at $t = 0$. Thus, the cases for a valid analytic solution
excluded by our approach in finding $y_1(t)$ are here reconciled. Finally, if $a$ is not
an integer, then $y_2(t)$ is singular at $t = 0$ and, together with the analytic $y_1(t)$
given by (8.6.11), we have found a linearly independent set of solutions for the
Bessel-Clifford equation valid for $t > 0$.
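The passage from the recurrence $k(k+a)b_k + b_{k-1} = 0$ to its closed form is easy to check by machine. The Python sketch below, offered only as an illustration, builds the coefficients for the root $r = a$ step by step in exact arithmetic and compares them with the closed-form product. The function names and the sample value $a = 1/2$ are our own assumptions.

```python
from fractions import Fraction

def frobenius_coeffs(a, n_terms, b0=Fraction(1)):
    """b_0, ..., b_{n_terms-1} for the root r = a, from k(k + a) b_k + b_{k-1} = 0."""
    b = [Fraction(b0)]
    for k in range(1, n_terms):
        b.append(-b[-1] / (k * (k + a)))
    return b

def closed_form(a, k, b0=Fraction(1)):
    """b_k = (-1)^k b_0 / (k! (1 + a)(2 + a)...(k + a))."""
    denom = Fraction(1)
    for j in range(1, k + 1):
        denom *= j * (j + a)   # accumulates k! and (1 + a)...(k + a) together
    return (-1) ** k * b0 / denom

a = Fraction(1, 2)
coeffs = frobenius_coeffs(a, 6)
print([str(c) for c in coeffs])
print(all(coeffs[k] == closed_form(a, k) for k in range(6)))
```

Setting `a = 0` in the same functions gives the coefficients $(-1)^k/(k!)^2$, consistent with the observation that the two series coincide in that case.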


To complete this section, we consider a second example. 


Example 8.6.2. Find a Frobenius series solution of Bessel's equation,

$$t^2 y'' + t y' + (t^2 - \lambda^2)\, y = 0 \tag{8.6.13}$$


Solution. In section 8.5.3, we derived a solution to (8.6.13) in the case where
$\lambda$ is an integer. Thus, in what follows we assume that $\lambda > 0$ is not an integer.

Since $p(t) = 1$ and $q(t) = -\lambda^2 + t^2$, we have $p_0 = 1$ and $q_0 = -\lambda^2$, which
tells us that the indicial equation is

$$r(r-1) + r - \lambda^2 = r^2 - \lambda^2 = 0$$

Thus, $r = \pm\lambda$. Choosing $r = \lambda$ and using (8.6.4), (8.6.5), and (8.6.6), we find
that the three relevant series for the differential equation (8.6.13) are

$$(t^2 - \lambda^2)\, y(t) = \sum_{k=0}^{\infty} b_k t^{k+\lambda+2} - \sum_{k=0}^{\infty} \lambda^2 b_k t^{k+\lambda}$$

$$t\, y'(t) = \sum_{k=0}^{\infty} (k+\lambda)\, b_k t^{k+\lambda}$$

$$t^2 y''(t) = \sum_{k=0}^{\infty} (k+\lambda)(k+\lambda-1)\, b_k t^{k+\lambda}$$

From the form of Bessel's equation, the sum of these three expressions vanishes;
adding and simplifying, we observe that

$$\sum_{k=0}^{\infty} k(k+2\lambda)\, b_k t^{k+\lambda} + \sum_{k=0}^{\infty} b_k t^{k+\lambda+2} = 0$$

To combine the sums, we step up the index in the second summation by 2
and find

$$(1+2\lambda)\, b_1 t^{1+\lambda} + \sum_{k=2}^{\infty} \left[k(k+2\lambda)\, b_k + b_{k-2}\right] t^{k+\lambda} = 0$$

So, $(1+2\lambda) b_1 = 0$, and

$$k(k+2\lambda)\, b_k + b_{k-2} = 0, \quad k \ge 2 \tag{8.6.14}$$


One solution to this recurrence relation is obtained by setting $b_0 = 1$ and $b_1 = 0$.
Then, since we are assuming that $\lambda$ is not an integer and $b_1 = 0$, (8.6.14) implies
that all odd-subscripted coefficients are zero and that

$$b_k = \frac{-1}{k(k+2\lambda)}\, b_{k-2}, \quad k = 2, 4, \ldots$$

Therefore, it follows that in closed form we have

$$b_{2k} = \frac{(-1)^k\, 2^{-2k}}{k!\,(1+\lambda)(2+\lambda)\cdots(k+\lambda)}$$

and thus a Frobenius solution to the Bessel equation is

$$y(t) = t^{\lambda} + \sum_{k=1}^{\infty} \frac{(-1)^k\, 2^{-2k}}{k!\,(1+\lambda)(2+\lambda)\cdots(k+\lambda)}\, t^{2k+\lambda}$$

Note that since $\lambda > 0$, the ratio test can be applied to show that this series
converges for all values of $t$.
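A convenient test case for this Frobenius solution is $\lambda = 1/2$. There the product $2^{2k}\, k!\, (1+\lambda)\cdots(k+\lambda)$ collapses to $(2k+1)!$, so the series reduces to $t^{1/2}\sum_k (-1)^k t^{2k}/(2k+1)! = \sin t/\sqrt{t}$. The Python sketch below sums the series from the two-step recurrence with $b_0 = 1$ and compares it with that closed form; the function name and the number of terms are assumptions made for illustration.

```python
import math

def frobenius_bessel(lam, t, terms=30):
    """Partial sum of t^lam * sum_k b_{2k} t^(2k), with b_0 = 1 and
    b_{2k} = -b_{2k-2} / (2k (2k + 2 lam)), the even-index recurrence for r = lam."""
    total, b = 0.0, 1.0
    for k in range(terms):
        if k > 0:
            b = -b / (2 * k * (2 * k + 2 * lam))
        total += b * t ** (2 * k)
    return t ** lam * total

for t in (0.5, 1.0, 2.0, 5.0):
    print(t, frobenius_bessel(0.5, t), math.sin(t) / math.sqrt(t))
```

The two printed columns should agree to roughly machine precision for these values of $t$.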


A more detailed study of the Method of Frobenius is beyond the scope of this
text. (For further discussion, see Potter and Goldberg, Mathematical Methods, 
second edition, Great Lakes Press 1995.) 


Exercises 8.6 

In exercises 1-10, find the indicial equation and use the root that either is not 
an integer or that is the larger integer to find the first three nonzero coefficients 
in a Frobenius series solution to the given DE. 


1. $2t^2 y'' - ty' + (1+t)y = 0$

2. $2ty'' + y' + ty = 0$

3. $ty'' + (t-2)y' + y = 0$

4. $2ty'' + (1+4t)y' + y = 0$

5. $t^2 y'' - t(t+5)y' + (t+5)y = 0$

6. $2t^2 y'' - ty' + (t-5)y = 0$

7. $4t^2 y'' + 6ty' + (t-2)y = 0$

8. $2ty'' + (1-t)y' - y = 0$

9. $ty'' + ty' + (t-3)y = 0$

10. $3t^2 y'' - ty' - 4y = 0$

11. Find the indicial equation for the Cauchy-Euler equation

$$t^2 y'' + p\,t y' + q\,y = 0$$

12. Show that the roots of the indicial equation are equal for the Laguerre
equation

$$ty'' + (1-t)y' + qy = 0$$
8.7 For further study


8.7.1 Taylor series for first-order differential 
equations 


Let $y(t) = \sum_{n=0}^{\infty} a_n t^n$ be the Taylor series of a solution of

$$t y' + \lambda y = f(t) \tag{8.7.1}$$

where $\lambda$ is constant and $f(t) = \sum_{n=0}^{\infty} f_n t^n$.

(a) Show that

$$a_n = \frac{f_n}{n + \lambda}$$

(b) In terms of the infinite series derived in (a), what is the general solution
to (8.7.1)?

(c) Using series expansions appropriately and your work in (a), determine the
general solution to each of the following DEs.

(i) $ty' + 2y = e^t$
(ii) $ty' + 3y = \sin t$
(iii) $ty' + 4y = \arctan t$

(d) Show that

$$y(t) = \sum_{n=0}^{\infty} \frac{f_n}{n+\lambda}\, t^n = t^{-\lambda} \sum_{n=0}^{\infty} f_n \int_0^t x^{n+\lambda-1}\, dx = t^{-\lambda} \int_0^t x^{\lambda-1} f(x)\, dx$$

(e) Substitute directly in (8.7.1) to show that

$$y(t) = t^{-\lambda} \int_0^t x^{\lambda-1} f(x)\, dx$$

is indeed a solution.

(f) Solve (8.7.1) by use of an integrating factor (see section 2.3) and compare
your result to $y(t)$ as given in (e).


8.7.2 The Gamma function 


The Gamma function $\Gamma(x)$, like Bessel functions and families of orthogonal
polynomials, is a special function that plays an important role in many areas of
mathematics. The Gamma function is defined by

$$\Gamma(s+1) = \int_0^{\infty} e^{-t}\, t^s\, dt, \quad s > -1 \tag{8.7.2}$$

(a) Show that $\Gamma(1) = 1$.

(b) Use integration by parts to show that $\Gamma(s+1) = s\,\Gamma(s)$.

(c) Show that if $s$ is a positive integer, then $\Gamma(s+1) = s!$.

(d) Let $r > 0$ be given and recall that $\mathcal{L}[t^r] = \int_0^{\infty} e^{-st}\, t^r\, dt$. Hence show that

$$\mathcal{L}[t^r] = \frac{\Gamma(r+1)}{s^{r+1}}$$

(e) Show that

$$\Gamma\!\left(\tfrac{1}{2}\right) = \sqrt{\pi}$$

(f) Use (b) to show that

$$h^n\, \frac{\Gamma(n + x/h)}{\Gamma(x/h)} = x(x+h)(x+2h)\cdots(x+(n-1)h)$$

Hence, show that

$$1\cdot 3\cdot 5\cdots(2n-1) = 2^n\, \frac{\Gamma(n+1/2)}{\Gamma(1/2)} = \frac{2^n\, \Gamma(n+1/2)}{\sqrt{\pi}}$$

(g) Finally, explain why $1\cdot 3\cdot 5\cdots(2n-1) = (2n)!/(2^n n!)$ and therefore show

$$\Gamma\!\left(n + \tfrac{1}{2}\right) = \frac{(2n)!\,\sqrt{\pi}}{2^{2n}\, n!}$$
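Identities of this kind are easy to spot-check numerically before attempting a proof. As a small illustration, the Python sketch below uses the standard library function `math.gamma` to compare $\Gamma(n + \tfrac{1}{2})$ with $(2n)!\sqrt{\pi}/(2^{2n} n!)$ for the first few $n$; the helper name `half_integer_gamma` is our own.

```python
import math

def half_integer_gamma(n):
    """Right-hand side (2n)! * sqrt(pi) / (2^(2n) * n!) for a non-negative integer n."""
    return math.factorial(2 * n) * math.sqrt(math.pi) / (4 ** n * math.factorial(n))

for n in range(6):
    print(n, math.gamma(n + 0.5), half_integer_gamma(n))
```

Agreement of the two columns is evidence for the identity, not a proof; the exercise still asks for the argument through the recurrence $\Gamma(s+1) = s\,\Gamma(s)$.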


Appendix A Review of integration techniques


Several standard solution techniques for differential equations require us to 
integrate functions. Here we briefly review some fundamentals from calculus. 


u-substitution 


For integrals of the form

$$\int f(g(t))\, g'(t)\, dt$$

we can evaluate the integral by undoing a chain rule through a change of
variables. Letting $u = g(t)$, it follows $du = g'(t)\, dt$, and thus

$$\int f(g(t))\, g'(t)\, dt = \int f(u)\, du$$


If we can evaluate the new, simpler integral in $u$, all that remains is to substitute
back to the variable $t$. For instance, to evaluate

$$\int t \sin t^2\, dt$$

we let $u = t^2$ and $du = 2t\, dt$. We note that $t\, dt = \tfrac{1}{2}\, du$. Thus, substituting for $t^2$
and $t\, dt$, we find that the given integral is equivalently

$$\int \tfrac{1}{2} \sin u\, du$$

Evaluating the integral in $u$ and substituting back to $t$,

$$\int t \sin t^2\, dt = \int \tfrac{1}{2} \sin u\, du = -\tfrac{1}{2} \cos u + C = -\tfrac{1}{2} \cos t^2 + C$$

Overall, u-substitution is particularly relevant for working with composite
functions. In attempting to use u-substitution, we should search the integrand
for an inside function, and then hope that its derivative (up to a constant
multiple) is present outside the composite function.
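A change of variables also changes the limits of a definite integral, which gives a simple numerical check of the substitution $u = t^2$. In the Python sketch below, a basic midpoint rule (our own helper, not a library routine) approximates $\int_0^2 t \sin t^2\, dt$ and $\int_0^4 \tfrac{1}{2}\sin u\, du$, and both are compared with the value obtained from the antiderivative $-\tfrac{1}{2}\cos t^2$. The interval $[0, 2]$ is an arbitrary choice for illustration.

```python
import math

def midpoint_rule(f, a, b, n=20000):
    """Approximate the integral of f over [a, b] with n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Integral in the original variable t over [0, 2]
in_t = midpoint_rule(lambda t: t * math.sin(t ** 2), 0.0, 2.0)

# Same integral after u = t^2: the limits become u = 0 and u = 4
in_u = midpoint_rule(lambda u: 0.5 * math.sin(u), 0.0, 4.0)

# Value from the antiderivative -cos(t^2)/2 evaluated between t = 0 and t = 2
exact = -0.5 * math.cos(4.0) + 0.5 * math.cos(0.0)

print(in_t, in_u, exact)
```

All three numbers should agree to several decimal places, which is what the substitution rule predicts.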


Examples for further practice: 


1. fier dt 


2. rar —1a dt 


3, [oct dt 


4. $\int \dfrac{\sin t}{1 + \cos^2 t}\, dt$


5. [on t) dt Hint: sin? t = 1 — cos? t. 


Integration by parts 


As u-substitution is used to undo the chain rule, integration by parts undoes
the product rule. It is particularly applicable to integrals that involve products
of basic functions such as $\int t e^t\, dt$.
Recall that the product rule states

$$\frac{d}{dt}\left[u(t)v(t)\right] = u(t)v'(t) + v(t)u'(t) \tag{A.1}$$

Integrating both sides of (A.1), it follows that

$$u(t)v(t) = \int u(t)v'(t)\, dt + \int v(t)u'(t)\, dt \tag{A.2}$$

Solving for $\int u(t)v'(t)\, dt$, we have

$$\int u(t)v'(t)\, dt = u(t)v(t) - \int v(t)u'(t)\, dt \tag{A.3}$$

Writing $dv = v'(t)\, dt$ and $du = u'(t)\, dt$ and suppressing the presence of $t$, we
see in (A.3) the standard statement of the integration by parts rule:

$$\int u\, dv = uv - \int v\, du \tag{A.4}$$

For example, let's evaluate $\int t e^t\, dt$. Letting $u = t$ and $dv = e^t\, dt$, we observe that
$du = dt$ and $v = e^t$. Thus, integrating by parts,

$$\int t e^t\, dt = t e^t - \int e^t\, dt = t e^t - e^t + C$$

A good way to think of integration by parts is to view it as integrating the
product $u\, dv$ by trading $u$ for its derivative and trading $dv$ for its antiderivative.
In particular, once we have decided to use integration by parts, we must make 
appropriate choices for u and dv. One guideline is that dv should be fairly easy 
to antidifferentiate; another is that the derivative of u should not be significantly 
more complicated than u itself. Overall, we generally want the integral of v du to 
be somehow simpler (or at least not more complicated) than the integral of u dv. 


Examples for further practice: 


1. f inede 
2. f sesinear 
3, fate at 


4. [iviesa 


5. [reat Hint: Try dv = 1. 


6. [ Peat 
7. [ eccoseat 


Partial fractions 


A remarkable fact is that any rational function (that is, any quotient of two 
polynomials) may be integrated. The standard method for approaching an 
integration problem of the form

$$\int \frac{p(t)}{q(t)}\, dt$$


is the technique known as partial fractions. It is necessary to assume (or apply 
long division so) that the degree of p is less than the degree of q. While partial 
fractions is an important technique for integration, it is also a useful tool in its 
own right. For example, we frequently use it when working with the Laplace 
transform; see sections 5.5 and 5.6. 

The method is best understood through a sequence of examples. 


Example A.1 Evaluate the integral

$$\int \frac{t}{t^2 + 5t + 6}\, dt \tag{A.5}$$

Solution. Factoring the integrand, we can write

$$\frac{t}{t^2 + 5t + 6} = \frac{t}{(t+2)(t+3)} \tag{A.6}$$

If we view the righthand side as the result of adding two simpler fractions, we can
make the reasonable assumption that two fractions of the form $A/(t+2)$ and
$B/(t+3)$ had to be combined by getting a common denominator to form (A.6).
Thus we assume

$$\frac{t}{(t+2)(t+3)} = \frac{A}{t+2} + \frac{B}{t+3} \tag{A.7}$$

and seek values of $A$ and $B$ which make this relationship hold for all values of $t$.
Multiplying both sides of (A.7) by $(t+2)(t+3)$, we find

$$t = A(t+3) + B(t+2) \tag{A.8}$$

Since (A.8) must be valid for every value of $t$, we can choose $t$-values that
make it especially easy to identify $A$ and $B$. Choosing $t = -2$, we see that
$-2 = A(-2+3) = A$. Choosing $t = -3$, it follows $-3 = B(-3+2)$, so $B = 3$.
Thus, we have determined

$$\frac{t}{(t+2)(t+3)} = \frac{-2}{t+2} + \frac{3}{t+3} \tag{A.9}$$

Having completed the partial fraction decomposition, we can now integrate. In
particular,

$$\int \frac{t}{t^2 + 5t + 6}\, dt = \int \left(\frac{-2}{t+2} + \frac{3}{t+3}\right) dt = -2\ln(t+2) + 3\ln(t+3) + C$$

The approach of example A.1 works any time the denominator $q(t)$ can
be written as a product of distinct linear terms. That is, if $q(t) = (t - r_1)(t - r_2)\cdots(t - r_n)$,
then we can write

$$\frac{p(t)}{q(t)} = \frac{A_1}{t - r_1} + \frac{A_2}{t - r_2} + \cdots + \frac{A_n}{t - r_n}$$

and use algebra similar to our work above to determine $A_1, \ldots, A_n$.
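When the denominator has distinct linear factors, choosing $t$ equal to each root in turn isolates one constant at a time: evaluating at $t = r_i$ gives $A_i = p(r_i)/\prod_{j \ne i}(r_i - r_j)$. The Python sketch below automates that step in exact arithmetic. It assumes the denominator is monic, its roots are distinct, and the degree of $p$ is less than the number of roots; the function names are hypothetical.

```python
from fractions import Fraction

def poly_eval(coeffs, t):
    """Evaluate a polynomial with coefficients [c_0, c_1, ...] (lowest degree first) at t."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * t + c
    return result

def partial_fraction_constants(p_coeffs, roots):
    """Constants A_i in p(t)/prod(t - r_i) = sum A_i/(t - r_i), for distinct roots r_i."""
    constants = []
    for i, r in enumerate(roots):
        denom = Fraction(1)
        for j, s in enumerate(roots):
            if j != i:
                denom *= (r - s)   # product of (r_i - r_j) over the other roots
        constants.append(poly_eval(p_coeffs, Fraction(r)) / denom)
    return constants

# Example A.1: t / ((t + 2)(t + 3)) has p(t) = t and roots -2 and -3
print(partial_fraction_constants([0, 1], [-2, -3]))
```

For example A.1 this reproduces $A = -2$ and $B = 3$. Repeated or irreducible quadratic factors need the more general forms discussed in the examples that follow.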


Example A.2 Evaluate the integral

$$\int \frac{t^2 - 4}{t^3 + t^2}\, dt$$

Solution. Factoring the denominator of the integrand, we have

$$\frac{t^2 - 4}{t^3 + t^2} = \frac{t^2 - 4}{t^2(t+1)}$$

If we think of the possible simpler fractions from which the given one can arise,
we see that it is possible for terms of the form

$$\frac{A}{t}, \quad \frac{B}{t^2}, \quad \text{and} \quad \frac{C}{t+1}$$

to be present. In particular, we must include $A/t$ since this denominator is
included in the necessary $B/t^2$. Thus we write

$$\frac{t^2 - 4}{t^2(t+1)} = \frac{A}{t} + \frac{B}{t^2} + \frac{C}{t+1} \tag{A.10}$$

Multiplying both sides of (A.10) by the least common denominator $t^2(t+1)$,
we find

$$t^2 - 4 = At(t+1) + B(t+1) + Ct^2$$

Setting $t = 0$ implies $-4 = B$; using $t = -1$ shows $-3 = C$. To find $A$, we
may use any other value of $t$, along with the established values of $B$ and $C$.
With $t = 1$,

$$-3 = A(1)(2) + (-4)(2) + (-3)(1)$$

and therefore $A = 4$. We now apply the partial fractions decomposition and
integrate:

$$\int \frac{t^2 - 4}{t^2(t+1)}\, dt = \int \left(\frac{4}{t} - \frac{4}{t^2} - \frac{3}{t+1}\right) dt = 4\ln t + 4t^{-1} - 3\ln(t+1) + C$$


In any rational function where the denominator contains a repeated factor, we
use a similar form of partial fraction decomposition. For instance,

$$\frac{-2t+1}{(t+4)^3(t-2)^2(t-5)} = \frac{A}{t+4} + \frac{B}{(t+4)^2} + \frac{C}{(t+4)^3} + \frac{D}{t-2} + \frac{E}{(t-2)^2} + \frac{F}{t-5}$$

so that each repeated factor is represented once for each possible order up to the
highest power.


Example A.3 Evaluate the integral

$$\int \frac{t - 5}{t^3 + t}\, dt$$

Solution. When we factor the integrand, we observe that a quadratic term is
present that cannot be factored further. In particular,

$$\frac{t - 5}{t^3 + t} = \frac{t - 5}{t(t^2 + 1)}$$

In this case, we assume that the right hand fraction may be decomposed into
the sum

$$\frac{t - 5}{t(t^2 + 1)} = \frac{A}{t} + \frac{Bt + C}{t^2 + 1} \tag{A.11}$$

The linear term $Bt + C$ in the numerator of the last fraction is necessary; if we
used only $C$ as the numerator of the second fraction, a contradiction may arise
in attempting to find $A$ and $C$. Multiplying both sides of (A.11) by $t(t^2 + 1)$,

$$t - 5 = A(t^2 + 1) + t(Bt + C) \tag{A.12}$$

Besides $t = 0$, there are no obvious real values of $t$ that enable us to easily
deduce the values of $A$, $B$, and $C$. Choosing any three distinct values of $t$
will lead to a system of three linear equations in $A$, $B$, and $C$ which may be
solved. Alternatively, we can expand and equate like coefficients in (A.12).
Specifically, since

$$t - 5 = At^2 + A + Bt^2 + Ct$$

equating constant terms implies $A = -5$, equating linear terms shows $C = 1$,
and the quadratic terms require that $A + B = 0$, thus $B = 5$. We have now found
the partial fraction decomposition and are ready to integrate. Doing so,

$$\begin{aligned}
\int \frac{t - 5}{t(t^2 + 1)}\, dt &= \int \left(\frac{-5}{t} + \frac{5t + 1}{t^2 + 1}\right) dt \\
&= \int \left(\frac{-5}{t} + \frac{5t}{t^2 + 1} + \frac{1}{t^2 + 1}\right) dt \qquad \text{(A.13)} \\
&= -5\ln t + \frac{5}{2}\ln(t^2 + 1) + \arctan t + C
\end{aligned}$$

Note that from the first step to (A.13) we performed the key algebraic
separation

$$\frac{5t + 1}{t^2 + 1} = \frac{5t}{t^2 + 1} + \frac{1}{t^2 + 1}$$

so that we could integrate the first term by u-substitution ($u = t^2 + 1$) and
recognize the integral of the second as the familiar arctangent function.


When a rational function's denominator is factored, any time a term of the form
$t^2 + a^2$ arises, we must include a linear term in the numerator of the proposed
partial fraction decomposition. For instance, if we were decomposing

$$\frac{t}{(t^2 + 9)(t^2 + 25)}$$

the appropriate form to assume for the sum of simpler fractions would be

$$\frac{t}{(t^2 + 9)(t^2 + 25)} = \frac{At + B}{t^2 + 9} + \frac{Ct + D}{t^2 + 25}$$

The observations we have made for the cases of distinct linear terms, repeated
linear terms, and irreducible quadratic terms may be combined, as needed, in any
problem where a partial fraction decomposition is sought.
Examples for further practice:
xt +1 
| dt 
(tf —2)(t— 1)(t+3) 


ep+ttl 
_festly 


a7 t—t—1 7 
“} PB-6t2+11t-6 
t—t—1 
af a 
(t — 3) 
t+2 
5. | ——~dt 
[ 


t 
e 
6 | aacett 


Tables and computer algebra systems 


In addition to the methods of u-substitution, integration by parts, and partial 
fractions, there are other standard integration techniques that enable us to 
deduce a wide range of results. Students normally learn a handful of integration 
techniques in calculus; it is also the case that entire books exist that are filled 
with tables of integrals and almost every calculus book includes at least a short 
table of integrals, typically a few pages long. 

It is common for integral tables to include results such as

$$\int \sin mt\, \sin nt\, dt = -\frac{1}{2(m+n)} \sin(m+n)t + \frac{1}{2(m-n)} \sin(m-n)t, \quad m \ne \pm n$$

Given an integral that aligns with this form, say

$$\int \sin 5t\, \sin 3t\, dt$$

it is a straightforward exercise to identify $m$ and $n$ and thus evaluate the
integral.

In other cases, the identification of the appropriate rule in a table is 
more subtle and involved. In table A.1, we see that for the given collection of 
examples, even a slight change in the integrand leads to a major difference in the 
result. 

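Table A.1 Integrals involving $u^2 \pm a^2$

      Function                                        Antiderivative
(1)   $\int \frac{du}{a^2 + u^2}$                     $\frac{1}{a} \arctan \frac{u}{a}$
(2)   $\int \frac{du}{\sqrt{u^2 \pm a^2}}$            $\ln\left|u + \sqrt{u^2 \pm a^2}\right|$
(3)   $\int \sqrt{u^2 \pm a^2}\, du$                  $\frac{u}{2}\sqrt{u^2 \pm a^2} \pm \frac{a^2}{2} \ln\left|u + \sqrt{u^2 \pm a^2}\right|$
(4)   $\int \frac{u^2\, du}{\sqrt{u^2 \pm a^2}}$      $\frac{u}{2}\sqrt{u^2 \pm a^2} \mp \frac{a^2}{2} \ln\left|u + \sqrt{u^2 \pm a^2}\right|$
(5)   $\int \frac{du}{u\sqrt{u^2 + a^2}}$             $-\frac{1}{a} \ln\left|\frac{a + \sqrt{u^2 + a^2}}{u}\right|$
(6)   $\int \frac{du}{u\sqrt{u^2 - a^2}}$             $\frac{1}{a}\, \mathrm{arcsec}\, \frac{u}{a}$

In addition, we note that it takes some care in order to correctly identify
which line in an integral table to use in certain examples. For instance, if we
wish to evaluate the integral

$$\int \frac{dt}{5t\sqrt{4t^2 + 9}} \tag{A.14}$$

it appears that (A.14) most resembles (5) in table A.1. To use this statement
in the table, it is necessary that we execute a u-substitution. We see that letting
$u = 2t$ implies $u^2 = 4t^2$, $t = u/2$, and $dt = du/2$. Replacing the three appearances
of $t$ in (A.14), we have

$$\int \frac{dt}{5t\sqrt{4t^2 + 9}} = \int \frac{\tfrac{1}{2}\, du}{5 \cdot \tfrac{u}{2} \sqrt{u^2 + 9}} = \frac{1}{5} \int \frac{du}{u\sqrt{u^2 + 9}}$$

Applying (5) in table A.1 to our most recent result (with $a = 3$) and then
substituting back to $t$, we find

$$\int \frac{dt}{5t\sqrt{4t^2 + 9}} = \frac{1}{5}\left(-\frac{1}{3}\right) \ln\left|\frac{3 + \sqrt{u^2 + 9}}{u}\right| + C = -\frac{1}{15} \ln\left|\frac{3 + \sqrt{4t^2 + 9}}{2t}\right| + C$$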
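Because a table lookup combined with a substitution leaves several places for a sign or constant to go astray, it is worth differentiating the final answer as a check. The Python sketch below does this numerically for the integral of $1/(5t\sqrt{4t^2+9})$, comparing a central-difference derivative of the candidate antiderivative $-\tfrac{1}{15}\ln\left|\frac{3+\sqrt{4t^2+9}}{2t}\right|$ with the integrand at a few sample points. The helper names and the step size are our own assumptions.

```python
import math

def integrand(t):
    """The integrand 1 / (5 t sqrt(4 t^2 + 9))."""
    return 1.0 / (5 * t * math.sqrt(4 * t * t + 9))

def antiderivative(t):
    """Candidate antiderivative -(1/15) ln |(3 + sqrt(4 t^2 + 9)) / (2 t)|."""
    return -math.log(abs((3 + math.sqrt(4 * t * t + 9)) / (2 * t))) / 15

def central_difference(F, t, h=1e-5):
    """Central-difference approximation to F'(t)."""
    return (F(t + h) - F(t - h)) / (2 * h)

for t in (0.5, 1.0, 3.0, -2.0):
    print(t, central_difference(antiderivative, t), integrand(t))
```

The two columns should match closely at every sample point, including the negative one, which is where the absolute value inside the logarithm matters.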


An available option in the consideration of any integral is the use of a computer
algebra system. In Maple, the syntax > int(f(t), t); results in the
program attempting to evaluate the integral. For example,


> int(exp(sqrt(t))/sqrt(t), t);

produces the output

$$2e^{\sqrt{t}}$$

which shows that

$$\int \frac{e^{\sqrt{t}}}{\sqrt{t}}\, dt = 2e^{\sqrt{t}} + C$$

There are integrals that Maple evaluates but that produce unusual output,
such as


> int(exp(-t^2), t);


which results in

$$\frac{1}{2}\sqrt{\pi}\; \mathrm{erf}(t)$$

The function erf is the so-called error function which arises frequently in
probability and statistics and is itself defined by a definite integral. The notation
erf(t) is used since $e^{-t^2}$ lacks an elementary antiderivative.
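The relationship between the error function and the integral of $e^{-t^2}$ can be confirmed numerically without a computer algebra system. The Python sketch below approximates $\int_0^x e^{-t^2}\, dt$ with a composite Simpson's rule (our own helper) and compares it with $\tfrac{1}{2}\sqrt{\pi}\,\mathrm{erf}(x)$ using `math.erf` from the Python standard library. The sample values of $x$ are arbitrary.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n subintervals (n must be even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def gaussian(t):
    return math.exp(-t * t)

for x in (0.5, 1.0, 2.0):
    numeric = simpson(gaussian, 0.0, x)
    closed = 0.5 * math.sqrt(math.pi) * math.erf(x)
    print(x, numeric, closed)
```

Since erf(0) = 0, the antiderivative $\tfrac{1}{2}\sqrt{\pi}\,\mathrm{erf}(t)$ vanishes at $t = 0$, so its value at $x$ equals the definite integral from 0 to $x$.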

Other integrals, some of which may be evaluated with human intervention,
Maple is unable to execute. For instance, the integral

$$\int (1 + t)\, e^t \sqrt{1 + (te^t)^2}\; dt \tag{A.15}$$

cannot be evaluated by Maple (when entered and executed, the program
simply returns the integral unevaluated). However, if we recognize that the
u-substitution $u = te^t$ leads to (A.15) being equivalently expressed as the
integral

$$\int \sqrt{1 + u^2}\; du$$

then we observe that this integral in $u$ may be easily evaluated by Maple or found
in any standard table.

Overall, the reader is advised to be well versed in the standard integration
methods, to practice them as needed, and to realize that even with lengthy tables
and the availability of computer algebra systems, evaluating integrals is often
both a challenging and involved task.
Complex numbers


Complex numbers arise naturally in the solution of quadratic (and other
polynomial) equations. For example, the equation

$$t^2 + 1 = 0$$

has no real number solutions. But, if we want any quadratic equation to have two
solutions, it is natural to say that $t^2 = -1$ and therefore $t = \pm\sqrt{-1}$. We denote
$\sqrt{-1}$ by the symbol "$i$", and thus say that $t = \pm i$ are solutions to $t^2 + 1 = 0$.

Similarly, if we have the equation $t^2 + 2t + 5 = 0$ and we apply the quadratic
formula, it follows

$$t = \frac{-2 \pm \sqrt{2^2 - 4\cdot 1\cdot 5}}{2} = \frac{-2 \pm \sqrt{-16}}{2}$$

Using $i = \sqrt{-1}$, we have

$$t = \frac{-2 \pm 4i}{2} = -1 \pm 2i$$

In general, a complex number $z$ is any number of the form

$$z = a + bi$$

where $a$ and $b$ are both real numbers and $i$ satisfies $i^2 = -1$. Complex
numbers are naturally represented as points in the so-called complex plane,
which corresponds to $\mathbb{R}^2$; the set of all complex numbers is denoted by $\mathbb{C}$.
In particular, given any complex number $z = a + bi$ we can associate $z$ with
the point $(a, b)$, as shown in figure B.1, where we see the particular example
$z = 3 + 2i$.

In the complex plane, the horizontal axis is known as the real axis, denoted
Re, and the vertical axis is the imaginary axis, labelled Im. For the complex
number $z = a + bi$, the real part of $z$ is $a$, and we write $\mathrm{Re}(z) = a$, while the
imaginary part of $z$ is $b$, denoted $\mathrm{Im}(z) = b$.

Figure B.1 The complex number $z = 3 + 2i$.

This geometric interpretation of complex numbers leads to other natural
concepts. The modulus $|z|$ of $z = 3 + 2i$ is defined to be the length of the line
segment from the origin to the point $(3, 2)$, or $|z| = \sqrt{3^2 + 2^2} = \sqrt{13}$. Similarly,
to each complex number we associate an angle $\theta$, as shown in figure B.1, which
is known as the argument of $z$. For $z = 3 + 2i$, $\theta = \arctan(2/3)$. In general, for
$z = a + bi$, $|z| = \sqrt{a^2 + b^2}$ and $\theta = \arctan(b/a)$. The modulus and argument
essentially give us the polar representation of $z$, while $a$ and $b$ provide its
rectangular coordinates.

Just like with real numbers, we can add, subtract, multiply, and divide
complex numbers. For example, if $w = 2 + i$ and $z = 3 + 2i$, then

$$w + z = (2 + i) + (3 + 2i) = 5 + 3i$$

Complex addition, much like vector addition, is performed component-wise.
Subtraction is executed in the same manner. For multiplication, the distributive
law enables us to compute products of complex numbers. Specifically,

$$w \cdot z = (2 + i)(3 + 2i) = 6 + 4i + 3i + 2i^2 = 6 + 7i - 2 = 4 + 7i$$

To divide, we use the complex conjugate of the denominator to convert the
division problem to one of multiplication. The complex conjugate of $z = a + bi$
is $\bar{z} = a - bi$. For instance,

$$\frac{w}{z} = \frac{2 + i}{3 + 2i} = \frac{2 + i}{3 + 2i} \cdot \frac{3 - 2i}{3 - 2i} = \frac{6 + 3i - 4i - 2i^2}{9 - 4i^2} = \frac{8 - i}{13} = \frac{8}{13} - \frac{1}{13}\, i$$
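These hand computations are easy to confirm with Python's built-in complex type, which writes the imaginary unit as `j`. In the sketch below the division is carried out by multiplying numerator and denominator by the conjugate of the denominator rather than by using the `/` operator on two complex numbers directly. The variable names are our own.

```python
w = complex(2, 1)   # w = 2 + i
z = complex(3, 2)   # z = 3 + 2i

total = w + z                     # component-wise addition
product = w * z                   # distributive law with i^2 = -1
z_bar = z.conjugate()             # complex conjugate 3 - 2i
quotient = (w * z_bar) / (z * z_bar).real   # real denominator |z|^2 = 13
modulus = abs(z)                  # sqrt(3^2 + 2^2)

print(total, product, quotient, modulus)
```

Multiplying `quotient` by `z` recovers `w` up to rounding, which is a quick way to confirm that the quotient of $2 + i$ by $3 + 2i$ is $\tfrac{8}{13} - \tfrac{1}{13}i$.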


Using basic trigonometry and Euler's formula¹, we can gain a particularly
nice geometric perspective on the multiplication of complex numbers.

¹ Euler's formula, $e^{i\theta} = \cos\theta + i\sin\theta$, is introduced in section 3.5.

Figure B.2 The complex number $z = |z|\cos\theta + i|z|\sin\theta$, plotted as the point $(|z|\cos\theta, |z|\sin\theta)$.


Given a complex number $z$ with modulus $|z|$ and argument $\theta$, we may write $z$
in its rectangular form as

$$z = |z|\cos\theta + i|z|\sin\theta$$

as demonstrated in figure B.2. From Euler's formula, we see that it is equivalent
to write $z$ in the form

$$z = |z|\cos\theta + i|z|\sin\theta = |z|(\cos\theta + i\sin\theta) = |z|e^{i\theta}$$

Note particularly that the complex number $e^{i\theta} = \cos\theta + i\sin\theta$ has modulus 1;
that is, $e^{i\theta}$ lies on the unit circle in the complex plane.

Given another complex number $w$ with modulus $|w|$ and argument $\alpha$, we
may write $w = |w|e^{i\alpha}$, from which the product $w \cdot z$ is

$$w \cdot z = \left(|w|e^{i\alpha}\right) \cdot \left(|z|e^{i\theta}\right) = |w||z|\,e^{i(\alpha + \theta)} \tag{B.1}$$


The expression (B.1) for w-z shows that when two complex numbers are 
multiplied, the modulus of the product is the product of the two numbers’ 
moduli, while the argument of the product is the sum of the arguments of the 
two numbers. This is shown geometrically in figure B.3. 
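The multiplication rule for moduli and arguments can be observed directly with the `cmath` module in the Python standard library, whose `polar` function returns the pair (modulus, argument) of a complex number. The sketch below uses the sample values $w = 2 + i$ and $z = 3 + 2i$, which are our own choice for illustration.

```python
import cmath

w, z = complex(2, 1), complex(3, 2)

r_w, alpha = cmath.polar(w)        # modulus and argument of w
r_z, theta = cmath.polar(z)        # modulus and argument of z
r_wz, arg_wz = cmath.polar(w * z)  # modulus and argument of the product

print(r_wz, r_w * r_z)             # moduli multiply
print(arg_wz, alpha + theta)       # arguments add

# Rebuild the product from its polar data: |w||z| e^{i(alpha + theta)}
rebuilt = cmath.rect(r_w * r_z, alpha + theta)
print(rebuilt, w * z)
```

In general `cmath.polar` reports arguments in $(-\pi, \pi]$, so for other inputs the sum of the arguments may differ from the reported argument of the product by a multiple of $2\pi$; for the values chosen here no such adjustment is needed.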


Finally, it is important to note that because the complex numbers have so 
much in common with the real numbers, it makes sense to work with them 
in functions, too. For instance, we can consider a function such as 


\[ P(z) = z^5 - 3z^4 + (5 - 2i)z^3 + iz^2 - 2iz + 3 - 5i. \]


$P$ is a function for which we can input any complex number $z$; the output
will also be a complex number $P(z)$. For our work with solving differential
equations, it will sometimes be the case that we can find a complex solution
to a real differential equation, and that certain parts of the complex function
(in fact, its real and imaginary parts) will themselves be real solutions to the
differential equation. Our exposure to complex functions will be largely limited
to doing some algebraic work with them; when studied in depth these functions
lead to a rich area of mathematics known as complex analysis, where one can
discover how calculus can be extended from working with real functions to
complex ones.

Figure B.3 The product of complex numbers $w = |w|e^{i\alpha}$ and $z = |z|e^{i\theta}$.


Examples for further practice: 


1. For each complex number, identify its real and imaginary parts,
determine its complex conjugate, and write the number in the form
$|z|e^{i\theta}$.

(a) $z = 3 - 2i$
(b) $z = -4 + 9i$
(c) $z = 5$
(d) $z = 4i$


2. Evaluate the stated sum, difference, product, or quotient, and write the
result in the form $z = a + bi$.

(a) $(1 - 3i) + (4 + 7i)$
(b) $(2 - 5i) - (10 - i)$
(c) $(1 - 2i)i$
(d) $(5 - 2i)(i - 3)$
(e) $(1 + i)(1 - i)$


SS epee 
caer 
(8) 533 


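3. For any complex numbers $z$ and $w$, show that

(a) $\overline{z + w} = \bar{z} + \bar{w}$ and $\overline{zw} = \bar{z}\,\bar{w}$
(b) $z\bar{z} = |z|^2 \ge 0$
(c) $\mathrm{Re}(z + w) = \mathrm{Re}\,z + \mathrm{Re}\,w$
(d) $\mathrm{Re}(zw) = \mathrm{Re}\,z\,\mathrm{Re}\,w - \mathrm{Im}\,z\,\mathrm{Im}\,w$
(e) $\mathrm{Im}(zw) = \mathrm{Im}\,z\,\mathrm{Re}\,w + \mathrm{Re}\,z\,\mathrm{Im}\,w$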


4. Using the fact that $e^{i\theta} = \cos\theta + i\sin\theta$, determine the real and imaginary
parts of
(a) $e^{i\pi}$
(b) $e^{i\pi/6}$
(c) $e^{2+3i}$

Roots of polynomials


Polynomials are the most basic functions in all of mathematics. A real polynomial 
of degree n is a function p(t) of the form 


\[ p(t) = a_n t^n + a_{n-1}t^{n-1} + \cdots + a_1 t + a_0 \tag{C.1} \]


where $a_0, \ldots, a_n$ are real numbers with $a_n \ne 0$. A number $r$ is a root or zero of a
polynomial $p$ if and only if $p(r) = 0$. In addition, we note that $r$ is a root of $p$ if and only
if $(t - r)$ is a factor of $p(t)$, which means that we can express $p(t)$ in the form
$p(t) = (t - r)q(t)$, where $q$ is a polynomial of degree one less than $p$.

The roots of polynomials find important applications in many settings; in 
our study of differential equations and linear algebra, we must find polynomial 
zeros when solving the eigenvalue problem, as well as when determining 
fundamental solutions to higher order linear differential equations and linear 
systems of DEs. Here we briefly review some of the most important facts about 
the zeros of polynomial functions. 

From quadratic polynomials of the form $p(t) = at^2 + bt + c$, we know that
there are three possibilities for the zeros: $p$ may have two distinct real zeros,
one repeated real zero, or no real zeros. This can be observed in a variety of
ways, but a graphical perspective is compelling: if a quadratic function $p$ opens
upward (that is, its coefficient $a > 0$), then its vertex lies below the t-axis,
on the t-axis, or above the t-axis, leading respectively to the three
noted possibilities, as shown in figure C.1.

We can see the three cases from an algebraic perspective as well. From the 
quadratic formula, we know the zeros are given by 


\[ t = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \tag{C.2} \]

Figure C.1 Three concave up quadratic functions whose vertices lie,
respectively, below, on, and above the t-axis.

Figure C.2 Four cubic polynomials that demonstrate possible arrangements of the
zeros of cubic functions.


Thus, if $b^2 - 4ac > 0$, it follows that $p(t)$ has two distinct real roots. In the
case that $b^2 - 4ac = 0$, $p(t)$ has one repeated real root; here we say that $p(t)$
has a root of multiplicity 2. Finally, if $b^2 - 4ac < 0$, then although the term
$\sqrt{b^2 - 4ac}$ permits no real solutions, if we use complex numbers and write
$\sqrt{b^2 - 4ac} = i\sqrt{4ac - b^2}$, we find that $p(t)$ has two distinct complex roots.
Note from (C.2) that these two complex roots are complex conjugates of one
another; more on complex numbers can be found in appendix B.
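The three cases can be sketched in a few lines of Python. The function names `quadratic_roots` and `classify` are ours, chosen for illustration; `cmath.sqrt` is used so that a negative discriminant yields $i\sqrt{4ac - b^2}$ instead of an error.

```python
import cmath

def quadratic_roots(a, b, c):
    """Return the discriminant and both zeros of a*t^2 + b*t + c (a != 0)."""
    disc = b * b - 4 * a * c
    root = cmath.sqrt(disc)          # equals i*sqrt(4ac - b^2) when disc < 0
    return disc, (-b + root) / (2 * a), (-b - root) / (2 * a)

def classify(disc):
    """Name the case determined by the sign of the discriminant."""
    if disc > 0:
        return "two distinct real zeros"
    if disc == 0:
        return "one repeated real zero"
    return "two complex conjugate zeros"

# t^2 - 1, t^2 - 2t + 1, and t^2 + 1 illustrate the three cases in turn.
for coeffs in [(1, 0, -1), (1, -2, 1), (1, 0, 1)]:
    disc, r1, r2 = quadratic_roots(*coeffs)
    print(coeffs, classify(disc), r1, r2)
```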

The factored form of quadratic polynomials is also important. If $p(t)$ has
two real roots, say $t = -1$ and $t = 1$, then $p(t)$ can be written in the form
$p(t) = a(t + 1)(t - 1)$, where $p(t)$ is the product of two real linear terms. If $p(t)$
has a repeated root, say $t = 1$, then we have $p(t) = a(t - 1)^2$. Finally, if $p(t)$ has
complex roots, then $p(t)$ cannot be factored into a product of real linear terms.
For example, $p(t) = t^2 + 1$ is a quadratic function with roots $t = \pm i$; we say that
the quadratic term $t^2 + 1$ is irreducible.

For polynomials of degree greater than 2, many similar properties hold.
For example, for polynomials of degree 3, we can see graphically several
possibilities in figure C.2. In particular, a cubic polynomial can have a single
real, repeated root of multiplicity 3, such as the function $p(t) = (t - 1)^3$ shown
at left in figure C.2. Alternatively, it is possible for the function to have algebraic
form $p(t) = (t - 1)^2(t + 1)$, which leads to two real roots, one of which has
multiplicity 2, which corresponds to the left center function in figure C.2.
Likewise, a cubic function can have three distinct real zeros, as
$p(t) = t(t - 1)(t + 1)$ does (see the right center graph in the figure), or have only
a single real root (which leaves the remaining two roots to be complex) as shown
in the right-most graph in figure C.2.

Because a cubic function will have one end tend to $+\infty$ and the other
to $-\infty$, every cubic function is guaranteed to have at least one real
zero. It follows that we can write $p$ in the form $p(t) = (t - r)q(t)$ where $q$ is
quadratic, and from this we can deduce the four possible cases for the zeros
of $p$ discussed in the preceding paragraph. In fact, there even exists a cubic
formula analogous to the quadratic formula that explicitly provides the zeros of
$p(t) = at^3 + bt^2 + ct + d$ in terms of formulas involving the coefficients $a$, $b$,
$c$, and $d$. This formula is sufficiently complicated that we choose not to state
it here.

The patterns we have observed for quadratic and cubic polynomials can 
be proved to hold for real polynomials of any degree. In particular, we have 
seen so far that for any degree-2 polynomial, the function has two zeros 
provided we allow them to be complex and count them according to their 
multiplicity. Similarly, for any degree-3 polynomial, the function has exactly 
three zeros under the same proviso. The Fundamental Theorem of Algebra, 
first proved by Carl Friedrich Gauss in 1799, beautifully summarizes the 
situation. 


Theorem C.1 (The Fundamental Theorem of Algebra) If p(t) is a real 
polynomial of degree n, then p(t) has exactly n zeros provided we include 
complex zeros and count all zeros according to their multiplicity. 


Theorem C.1 can be proved using methods of complex analysis. Its 
main purpose for our work is that we are always guaranteed that n roots 
of a polynomial of degree n exist. Through the methods established in 
chapters 3 and 4 for dealing with complex and repeated roots of characteristic 
equations, the Fundamental Theorem of Algebra ultimately enables us to find 
all solutions to any homogeneous linear higher order DE or system of linear 
first-order DEs. 

We also note that it is possible to use standard ideas in complex analysis to
show that if $r$ is a complex root of a real polynomial $p$, then its complex conjugate
$\bar{r}$ is also a root of $p$. This guarantees that for real polynomials, complex roots
will always appear in conjugate pairs, just as we saw for the case of quadratic 
functions. 

While the Fundamental Theorem of Algebra guarantees the existence of
$n$ zeros of a polynomial of degree $n$, it unfortunately does not provide an
algorithm for finding them. In fact, though formulas exist for quadratic and
cubic equations, as well as the degree four case, mathematicians have shown
that no general formula built from the coefficients using only arithmetic
operations and radicals can provide the roots of every polynomial of
degree 5 or greater. For higher degree polynomial equations, this leads us to
resort to numerical methods or computer algebra systems; see section 4.6.1 for
more on how to use Maple to compute the roots of polynomial functions.
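To give a flavor of what a numerical method looks like, here is a minimal sketch of the Durand-Kerner iteration, one of several standard root-finding schemes and not the method used by Maple or by this text. The function names, the starting guesses, the fixed iteration count, and the sample quintic $t^5 - t - 1$ are all our own assumptions for illustration.

```python
def poly_eval(coeffs, t):
    """Evaluate a polynomial by Horner's rule; coeffs run from a_n down to a_0."""
    value = 0
    for a in coeffs:
        value = value * t + a
    return value

def durand_kerner(coeffs, iterations=500):
    """Approximate all zeros of a polynomial (coeffs from a_n down to a_0)."""
    n = len(coeffs) - 1
    monic = [a / coeffs[0] for a in coeffs]        # divide through by a_n
    # Distinct, non-real starting guesses: powers of 0.4 + 0.9i.
    roots = [(0.4 + 0.9j) ** k for k in range(n)]
    for _ in range(iterations):
        for i in range(n):
            denom = 1
            for j in range(n):
                if j != i:
                    denom *= roots[i] - roots[j]
            # Correct guess i using the current values of all other guesses.
            roots[i] = roots[i] - poly_eval(monic, roots[i]) / denom
    return roots

# A degree-5 example: p(t) = t^5 - t - 1.
p = [1, 0, 0, 0, -1, -1]
zeros = durand_kerner(p)
for r in zeros:
    print(r, abs(poly_eval(p, r)))
```

In line with Theorem C.1 and the conjugate-pair property, the five approximate zeros should consist of one real zero and two complex conjugate pairs.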
Examples for further practice:


1. For each of the following polynomial functions, state the degree, 
determine all of the zeros, and state the multiplicity of each zero. 


(a) p(t) = t(¢+2)(t +5) (t—3)(t-z)(?? +1) 


(b) p(t))=t*-1 

(c) p(t)=t*+1 

(d) p(t) = (#7 +13 (¢ — 3)9(#? — #12) 
(e) $p(t) = t^3 + 6t^2 + 9t$


2. Determine a formula for a real polynomial function of the least possible 
degree that satisfies the given criteria. State the degree of the function you 
find. If no such function is possible, explain why. 


(a) distinct zeros at t= —3, —1, 2 and a zero of multiplicity 3 at t = 0 

(b) complex zeros $t = \pm 3i$, each of multiplicity 2, and a single real zero of
multiplicity 1 at $t = 4$

(c) a zero of multiplicity 2 at t = —1, a zero of multiplicity 3 at t = 2, and 
a zero of multiplicity 4 at t =5 

(d) a polynomial of even degree with exactly one real zero of multiplicity 1
at $t = 0$


Linear transformations 


The notion of function is central to mathematics. Given any two collections of
objects $A$ and $B$, a function $f : A \to B$ is a rule that associates each element of $A$
with one and only one element of $B$. Sometimes, we use the terms mapping or
transformation in place of the word function. Among all functions, certain types 
stand out for their important properties and/or simplicity. In what follows, we 
focus on the property of linearity. 

In many different areas of our study of linear algebra and differential 
equations, we find that linear combinations of objects play a key role. Similarly, 
we encounter important functions that transform a certain group of objects 
into another collection. The combination of these ideas makes us naturally 
interested in transformations that preserve linear combinations. Let us consider 
three familiar examples. 


(1) For any $m \times n$ matrix $A$, any vectors $x$ and $y$ in $\mathbb{R}^n$, and any real number $c$,

\[ A(x + y) = Ax + Ay \quad\text{and}\quad A(cx) = cAx \]

(2) From calculus, if we let $D$ denote the differential operator, then for any
differentiable functions $f$ and $g$ and any real number $c$, we know by the
sum and constant multiple rules that

\[ D(f + g) = D(f) + D(g) \quad\text{and}\quad D(cf) = cD(f) \]

(3) In our studies of the Laplace transform $\mathcal{L}$ in chapter 5, we found that the
transform satisfies the property that for any acceptable functions $f$ and $g$
and any real constant $c$,

\[ \mathcal{L}[f(t) + g(t)] = \mathcal{L}[f(t)] + \mathcal{L}[g(t)] \quad\text{and}\quad \mathcal{L}[cf(t)] = c\mathcal{L}[f(t)] \]

Matrix-vector multiplication, differentiation, and the Laplace transform are all
examples of transformations: they take a given input (a vector or a function)
and transform that input to a (unique) output, a new vector or function.
Moreover, each satisfies the property that the transformation preserves sums
and scalar multiples: the transformation applied to a sum is the same as the
sum of the results of the transformation applied to the individual objects, and
the transformation applied to a scalar multiple of an input is identical to the
same scalar multiple of the output that results from the transformation applied
to the original object. Viewing these inputs as belonging to a vector space,¹
we arrive at the following formal definition.


Definition D.1 Let $U$ and $V$ be vector spaces. A transformation $T : U \to V$
is a linear transformation provided that for any vectors $u$ and $v$ in $U$ and any
scalar $c$, $T$ satisfies the properties $T(u + v) = T(u) + T(v)$ and $T(cu) = cT(u)$.


Two consequences of the definition are immediate: $T(0) = 0$ and $T(au + bv)
= aT(u) + bT(v)$ for all scalars $a$, $b$ and vectors $u$, $v$. Note that in the equation
$T(0) = 0$, the zero vector on the left is from $U$ while the one on the right is from
$V$, and thus these may not be the same zero vectors.

Linear transformations play a key structural role in linear algebra and in 
the theory of linear DEs. We first turn to a discussion of the matrix of a linear 
transformation of finite dimensional vector spaces. 


Matrix transformations 


In section 1.3, we first saw that Property (1) above holds for matrix-vector
multiplication. That is, given an $m \times n$ matrix $A$, for any two vectors $x$ and $y$ in
$\mathbb{R}^n$ and any constant $c$,

\[ A(x + y) = Ax + Ay \quad\text{and}\quad A(cx) = cAx \]

Thus, if we define the transformation $T : \mathbb{R}^n \to \mathbb{R}^m$ by the rule $T(x) = Ax$,
then it follows immediately that $T(x + y) = T(x) + T(y)$ and $T(cx) = cT(x)$,
which means that $T$ is a linear transformation. Said differently, the natural
multiplication function associated with a given matrix $A$ always generates a linear
transformation. We usually call $A$ the matrix of the transformation $T$. Consider
the following particular example.


: = ) and let T(x) = Ax. Determine T(e,), 


$T(e_2)$, and $T(e_3)$ where $\{e_1, e_2, e_3\}$ is the standard basis of $\mathbb{R}^3$, and then use
properties of linearity to determine $T(z)$ when $z = [-5 \;\; 2 \;\; -6]^T$.


Example D.1 Let A= E 


Solution. First, we observe that 
1 
3 —2 5 3 
rle))=Ae=|_} 0 | 0 = 27 (D.1) 


1 This appendix assumes that the reader is familiar with basic concepts in sections 1.11 and 1.12.
If the Laplace transform has not yet been studied, references to it may simply be skipped.
Similarly,
—2 5 
T(e2) = | and T(e3)= i (D.2) 
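Next, to compute $T(z)$, we observe that

\[ z = \begin{bmatrix} -5 \\ 2 \\ -6 \end{bmatrix}
     = -5\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}
       + 2\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}
       - 6\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
     = -5e_1 + 2e_2 - 6e_3 \]

and thus by the linearity of $T$ and (D.1) and (D.2), we have

\[ T(z) = T(-5e_1 + 2e_2 - 6e_3) = -5T(e_1) + 2T(e_2) - 6T(e_3) \]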


= a}e[a} 4s] 


There are at least two important observations to make from example D.1. The 
first is that, due to linearity, we can find the result of applying T to any vector 
if we first know the results of applying T to the basis vectors in the domain 
of T. Since any vector in the domain can be uniquely expressed as a linear 
combination of basis elements and T preserves linear combinations, we can 
easily apply T to the linear combination that generates the vector of our choice. 
This holds not just for the transformation in the example, but indeed for any 
linear transformation on a vector space. 

Furthermore, (D.1) and (D.2) indicate that there is a key relationship
between the values of the transformation applied to the domain's basis vectors
and the matrix of the transformation. Specifically, $T(e_1)$ is the first column of
$A$, and $T(e_2)$ and $T(e_3)$ are the second and third columns of $A$. That this result
holds in general is the content of the following theorem.


Theorem D.1 If $T : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation, then $T(x) = Ax$
where $A$ is the $m \times n$ matrix

\[ A = [\, T(e_1) \;\; T(e_2) \;\; \cdots \;\; T(e_n) \,] \]

and $e_j$ is the $j$th standard basis vector of $\mathbb{R}^n$. Moreover, the matrix $A$ is unique.


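Example D.2 Let $T : \mathbb{R}^2 \to \mathbb{R}^3$ be a linear transformation such that

\[ T(e_1) = \begin{bmatrix} -2 \\ 3 \\ 9 \end{bmatrix} \quad\text{and}\quad
   T(e_2) = \begin{bmatrix} 4 \\ -2 \\ 0 \end{bmatrix} \]

Determine the matrix $A$ of the transformation $T$ and use $A$ to compute $T(z)$
where $z = [-3 \;\; -2]^T$.

Solution. By theorem D.1, it follows that

\[ T(x) = Ax = \begin{bmatrix} -2 & 4 \\ 3 & -2 \\ 9 & 0 \end{bmatrix} x \]

Thus, we can compute $T(z)$ as

\[ T(z) = Az = \begin{bmatrix} -2 & 4 \\ 3 & -2 \\ 9 & 0 \end{bmatrix}
   \begin{bmatrix} -3 \\ -2 \end{bmatrix}
   = \begin{bmatrix} -2 \\ -5 \\ -27 \end{bmatrix} \]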
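The recipe of theorem D.1 is easy to mirror in code. The sketch below, in plain Python with helper names of our own choosing (`matrix_from_columns`, `apply`), builds $A$ from the images $T(e_1)$ and $T(e_2)$ given in example D.2 and then computes $T(z)$ in two ways: through the matrix, and directly from linearity.

```python
def matrix_from_columns(columns):
    """Build A (as a list of rows) whose j-th column is T(e_j)."""
    rows = len(columns[0])
    return [[col[i] for col in columns] for i in range(rows)]

def apply(A, x):
    """Matrix-vector product Ax, one dot product per row of A."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

T_e1 = [-2, 3, 9]
T_e2 = [4, -2, 0]
A = matrix_from_columns([T_e1, T_e2])

z = [-3, -2]
via_matrix = apply(A, z)
# Linearity gives the same vector: T(z) = -3 T(e1) - 2 T(e2).
via_linearity = [z[0] * u + z[1] * v for u, v in zip(T_e1, T_e2)]

print(A)
print(via_matrix, via_linearity)
```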


Linear differential equations 


In chapter 4, we solve higher order linear differential equations with constant
coefficients of the form

\[ y^{(n)} + a_{n-1}y^{(n-1)} + \cdots + a_1 y' + a_0 y = f(t) \tag{D.3} \]

In this setting, we can take a sophisticated perspective through linearity to see
how solving an equation such as

\[ y'' + 2y' + 3y = 0 \]

is very similar to solving the homogeneous linear system of algebraic equations
given by $Ax = 0$ where $A$ is an $m \times n$ matrix.


Recall that the derivative operator, $D$, is linear. The same is true of the
second derivative operator, $D^2$, since

\[ D^2(f + g) = (f + g)'' = f'' + g'' = D^2(f) + D^2(g) \]

and $D^2(cf) = (cf)'' = cf'' = cD^2(f)$. This alternate notation for derivatives
permits a new perspective on DEs. Consider that $y'' + 2y' + 3y = 0$ can now be
expressed as

\[ D^2(y) + 2D(y) + 3y = 0 \tag{D.4} \]
In this setting, we observe that the left side of (D.4) appears as if a function 


or process is being applied to the input y. If we let L be the transformation 
defined by 


\[ L(y) = D^2(y) + 2D(y) + 3y \]

then we see that (D.4) can be written equivalently as the equation

\[ L(y) = 0 \]

Moreover, this new transformation $L$ is linear. Observe that

\begin{align*}
L(f + g) &= D^2(f + g) + 2D(f + g) + 3(f + g) \\
         &= D^2(f) + D^2(g) + 2D(f) + 2D(g) + 3f + 3g \\
         &= D^2(f) + 2D(f) + 3f + D^2(g) + 2D(g) + 3g \\
         &= L(f) + L(g)
\end{align*}

Similarly, it is straightforward to show that for any constant $c$, $L(cf) = D^2(cf) +
2D(cf) + 3(cf) = cD^2(f) + 2cD(f) + 3cf = cL(f)$. Hence, we see that solving the
second-order equation (D.4) is equivalent to solving the linear homogeneous 
equation L(y) = 0, where L is the linear transformation just discussed. More 
generally, solving equations of the form (D.3) is equivalent to solving the linear 
equation 


\[ L(y) = f \]

where $L$ is the linear transformation defined by $L(y) = D^n(y) + a_{n-1}D^{n-1}(y) +
\cdots + a_1 D(y) + a_0 y$. While this perspective does not contribute substantially
to our methods for solving such equations, it does further emphasize why
these equations are classified as linear and why the characteristic polynomial
$r^n + a_{n-1}r^{n-1} + \cdots + a_1 r + a_0$ arises so naturally.
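The linearity of $L(y) = D^2(y) + 2D(y) + 3y$ can be spot-checked on polynomial inputs with a few lines of Python. This is a sketch under our own conventions: a polynomial is stored as its coefficient list $[a_0, a_1, a_2, \ldots]$, and the helper names `D`, `add`, `scale`, `L`, and `trim` are illustrative only. A check on two sample functions is of course not a proof.

```python
def D(p):
    """Derivative of a polynomial given as coefficients [a_0, a_1, a_2, ...]."""
    return [k * p[k] for k in range(1, len(p))] or [0]

def add(p, q):
    """Sum of two polynomials, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def scale(c, p):
    """Scalar multiple c*p."""
    return [c * a for a in p]

def L(y):
    """L(y) = D^2(y) + 2 D(y) + 3 y."""
    return add(add(D(D(y)), scale(2, D(y))), scale(3, y))

def trim(p):
    """Drop trailing zero coefficients so equal polynomials compare equal."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

f = [1, 0, 4]        # 1 + 4t^2
g = [0, -2, 0, 5]    # -2t + 5t^3
c = 7

print(trim(L(add(f, g))) == trim(add(L(f), L(g))))     # L(f + g) = L(f) + L(g)
print(trim(L(scale(c, f))) == trim(scale(c, L(f))))    # L(cf) = c L(f)
```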

Furthermore, the linearity of differential equations of form (D.3) together
with the fact that the Laplace transform is a linear operator is part of what
enables the Laplace transform to be such an effective tool. For example,
to solve

\[ y'' + 2y' + 3y = \delta(t - 3) \]

we take the Laplace transform of both sides of the equation to find

\[ \mathcal{L}[y'' + 2y' + 3y] = \mathcal{L}[\delta(t - 3)] \]

and thus by linearity

\[ \mathcal{L}[y''] + 2\mathcal{L}[y'] + 3\mathcal{L}[y] = \mathcal{L}[\delta(t - 3)] \]


From there, properties of the transform discussed in sections 5.3 and 5.4 enable 
us to proceed to where we only need to use the inverse Laplace transform to 
solve the equation, which brings us to yet another class of important linear 
transformations. 


Invertible transformations 


A function or transformation $T : U \to V$ is invertible provided that there exists
a function $T^{-1} : V \to U$ that satisfies the properties that

\[ T^{-1}[T(u)] = u \text{ for all } u \in U \quad\text{and}\quad T[T^{-1}(v)] = v \text{ for all } v \in V \]

Equivalently, in order for $T$ to be invertible, there must exist a function $T^{-1}$
that when composed with $T$ results in the appropriate identity mapping: $T^{-1} \circ
T = I_U$ and $T \circ T^{-1} = I_V$, where $I_U(u) = u$ for every $u \in U$ and $I_V(v) = v$
for every $v \in V$. Loosely, the transformation $T$ is invertible whenever there
exists a function $T^{-1}$ that reverses the work of $T$.

Any time a matrix $A$ is invertible, the resulting matrix transformation
$T(x) = Ax$ is an invertible transformation. Consider the following example.

Example D.3 Let $A = \begin{bmatrix} 3 & 2 \\ -2 & -1 \end{bmatrix}$ and let $T(x) = Ax$. Show that $T$ is an
invertible transformation and determine a formula for $T^{-1}$.

Solution. We first observe that since $\det(A) = 1 \ne 0$, the matrix $A$ is invertible.
In addition, we can compute $A^{-1}$ according to the standard algorithm, finding
via row-reduction that

\[ \left[\begin{array}{rr|rr} 3 & 2 & 1 & 0 \\ -2 & -1 & 0 & 1 \end{array}\right]
   \to
   \left[\begin{array}{rr|rr} 1 & 0 & -1 & -2 \\ 0 & 1 & 2 & 3 \end{array}\right] \]

Thus, the inverse of $A$ is

\[ A^{-1} = \begin{bmatrix} -1 & -2 \\ 2 & 3 \end{bmatrix} \]

Letting $T^{-1}(x) = A^{-1}x$, it follows that $T^{-1}(T(x)) = A^{-1}(Ax) = Ix = x$ and
$T(T^{-1}(x)) = A(A^{-1}x) = Ix = x$, which demonstrates that $T$ is invertible and
its inverse is given by the formula

\[ T^{-1}(x) = \begin{bmatrix} -1 & -2 \\ 2 & 3 \end{bmatrix} x \]
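The example can be checked with a short Python sketch. Note that, for brevity, the code below computes the inverse with the $2 \times 2$ adjugate formula rather than the row-reduction used in the solution; the helper names `inverse_2x2` and `apply` and the test vector are our own choices.

```python
from fractions import Fraction

def inverse_2x2(A):
    """Inverse of a 2 x 2 matrix via the adjugate formula; requires det(A) != 0."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

def apply(M, x):
    """Matrix-vector product Mx."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]

A = [[3, 2], [-2, -1]]
A_inv = inverse_2x2(A)

x = [5, -7]
forward = apply(A, x)                 # T(x)
round_trip = apply(A_inv, forward)    # T^{-1}(T(x)) should return x

print(A_inv, forward, round_trip)
```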


Invertible matrix transformations find many important applications, 
including a prominent role in computer graphics. When matrix transformations 
are used to move a graphical image in a particular way, the inverse transformation
is needed to move the object back. More on such transformations can
be studied in section 1.8.1 and in the project found at the end of chapter 1 
in 1.13.1. 

In the study of differential equations, two other invertible linear transformations
are important. One is found in the integral operator

\[ S(f(x)) = \int_0^x f(t)\,dt \]


which is closely linked to the differential operator, D(f(x)) = f’(x). Specifically, 
since a typical differential equation involves an unknown function and one or 
more of its derivatives, a natural approach is to attempt to integrate. In fact, for 
first-order equations that are separable, integration is the standard approach; 
with some care, integration also works well for linear first-order equations as 
well as exact equations. In these approaches, as well as in others used to solve 
differential equations, we use the fact that integration essentially reverses the 
process of differentiation. Here, we take care to be more precise about this fact. 

Let $U$ be the vector space of all continuously differentiable functions $f$ such
that $f(0) = 0$, and $V$ the vector space of all continuous functions.² Then, we see
that $D : U \to V$ and $S : V \to U$. Furthermore, for any $f$ in $U$,

\[ S(D(f)) = S(f') = \int_0^x f'(t)\,dt = f(x) - f(0) = f(x) \]

since $f(0) = 0$, and for any function $g$ in $V$,

\[ D(S(g)) = D\left( \int_0^x g(t)\,dt \right) = g(x) \]

by the Fundamental Theorem of Calculus. Thus $S(D(f)) = f$ and
$D(S(g)) = g$ for all relevant functions. This shows that $D$ and $S$ are each
invertible transformations, and moreover that they are each other's respective
inverses. Moreover, as we have noted on several occasions and is studied in
calculus, both $D$ and $S$ are linear transformations.

2 The choices of $U$ and $V$ can be made considerably broader; doing so involves some subtleties from
real analysis that are beyond the scope of this course. See, for instance, Real Analysis, by Bruckner,
Bruckner, and Thomson, 1996, for a discussion on which functions have the property that they are
differentiable and their derivative is integrable, as well as which functions are integrable.
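Restricting attention to polynomials, which lie in both spaces, the inverse relationship between $D$ and $S$ can be sketched in Python. The coefficient-list representation $[a_0, a_1, a_2, \ldots]$ and the sample polynomials are our own assumptions; exact fractions avoid rounding. The last example shows why the condition $f(0) = 0$ is needed.

```python
from fractions import Fraction

def D(p):
    """Derivative; p lists coefficients [a_0, a_1, a_2, ...]."""
    return [k * p[k] for k in range(1, len(p))] or [Fraction(0)]

def S(p):
    """Integral from 0 to x, so the result always has constant term 0."""
    return [Fraction(0)] + [Fraction(a) / (k + 1) for k, a in enumerate(p)]

f = [Fraction(0), Fraction(3), Fraction(-1), Fraction(2)]   # f(0) = 0, so f is in U
g = [Fraction(4), Fraction(0), Fraction(6)]                 # a continuous function in V
h = [Fraction(5), Fraction(3)]                              # h(0) = 5, so h is not in U

print(S(D(f)) == f)   # S undoes D on U
print(D(S(g)) == g)   # D undoes S on V
print(S(D(h)) == h)   # fails: the constant term 5 is lost
```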

Finally, the Laplace transform is a key example of an invertible linear 
transformation, and its invertibility ultimately is what makes it such a useful 
tool in the solution of linear differential equations. To emphasize several of 
the important properties, we consider an example of a fundamental initial- 
value problem and discuss the role of the Laplace transform in its solution. 
Specifically, we examine the role of the Laplace transform in the solution of
the IVP

\[ y'' + 3y' + 2y = 0, \quad y(0) = -1, \quad y'(0) = 7 \]

First, recall that $\mathcal{L}$ is a linear transformation on the vector space of acceptable
functions and that $\mathcal{L}$ transforms a given acceptable function $y(t)$ to a new
function $Y(s)$. If we now apply the transform to both sides of the differential
equation, the linearity of $\mathcal{L}$ implies that

\[ \mathcal{L}[y''] + 3\mathcal{L}[y'] + 2\mathcal{L}[y] = 0 \tag{D.5} \]

From properties of $\mathcal{L}$ developed in chapter 5, we know that $\mathcal{L}[y''] = s^2\mathcal{L}[y] -
sy(0) - y'(0)$ and $\mathcal{L}[y'] = s\mathcal{L}[y] - y(0)$. Therefore, (D.5) can be updated to the
equation

\[ s^2\mathcal{L}[y] + s - 7 + 3(s\mathcal{L}[y] + 1) + 2\mathcal{L}[y] = 0 \tag{D.6} \]


Observe that (D.6) is now an algebraic (rather than differential) equation in
$Y(s) = \mathcal{L}[y(t)]$. Moreover, whereas before the equation we were trying to solve
was a differential equation with three unknowns ($y$, $y'$, and $y''$), now there is
only one unknown, $\mathcal{L}[y]$, in (D.6). Solving for $\mathcal{L}[y]$, we find

\[ \mathcal{L}[y](s^2 + 3s + 2) = 4 - s \]

and therefore

\[ \mathcal{L}[y] = \frac{4 - s}{s^2 + 3s + 2} \tag{D.7} \]

At this point, the natural remaining step to solve for $y$ becomes evident. Since
$\mathcal{L}$ is an invertible transformation, $\mathcal{L}^{-1}[\mathcal{L}[y]] = y$, and thus we want to take the
inverse Laplace transform of both sides of (D.7). One key computation must
be performed first, as it turns out that a different algebraic form of the
right-hand side is useful. A partial fraction decomposition of $(4 - s)/(s^2 + 3s + 2)$
reveals that (D.7) can be equivalently expressed as

\[ \mathcal{L}[y] = \frac{5}{s + 1} - \frac{6}{s + 2} \tag{D.8} \]

Now we are ready to use the inverse Laplace transform; it, like the transform
itself, is linear, and thus we find that

\[ \mathcal{L}^{-1}[\mathcal{L}[y]] = \mathcal{L}^{-1}\left[ \frac{5}{s + 1} - \frac{6}{s + 2} \right] \tag{D.9} \]

and therefore

\[ y = 5\mathcal{L}^{-1}\left[ \frac{1}{s + 1} \right] - 6\mathcal{L}^{-1}\left[ \frac{1}{s + 2} \right] \tag{D.10} \]

A standard fact about the Laplace transform is that for any real number $a$,
$\mathcal{L}[e^{at}] = 1/(s - a)$. From this, (D.10) implies that

\[ y = 5e^{-t} - 6e^{-2t} \]


which is the solution to the original initial-value problem. 
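Two steps of this computation lend themselves to a quick check in Python: the partial fraction decomposition, and the claim that $y = 5e^{-t} - 6e^{-2t}$ satisfies $y'' + 3y' + 2y = 0$. The sketch below is our own; it compares the two forms of $\mathcal{L}[y]$ exactly at a few sample values of $s$ (chosen arbitrarily, avoiding $s = -1$ and $s = -2$) and evaluates the residual of the differential equation numerically, with the derivatives of $y$ entered by hand.

```python
import math
from fractions import Fraction

def Y(s):
    """The transform before decomposition: (4 - s)/(s^2 + 3s + 2)."""
    return (4 - s) / (s * s + 3 * s + 2)

def Y_partial(s):
    """The same transform in partial-fraction form."""
    return 5 / (s + 1) - 6 / (s + 2)

# Exact agreement at a few rational sample points.
samples = [Fraction(0), Fraction(1, 2), Fraction(3), Fraction(-7, 3)]
agree = all(Y(s) == Y_partial(s) for s in samples)

def y(t):
    return 5 * math.exp(-t) - 6 * math.exp(-2 * t)

def dy(t):
    return -5 * math.exp(-t) + 12 * math.exp(-2 * t)

def d2y(t):
    return 5 * math.exp(-t) - 24 * math.exp(-2 * t)

# The residual y'' + 3y' + 2y should vanish up to rounding error.
residuals = [d2y(t) + 3 * dy(t) + 2 * y(t) for t in (0.0, 0.5, 1.0, 2.0)]

print(agree, residuals)
```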

As we have noted throughout our discussion, the Laplace transform’s 
linearity and invertibility play essential roles in the application of this tool 
to initial-value problems. These fundamental ideas demonstrate the valuable 
nature of the properties of linearity and invertibility, not just with the Laplace 
transform, but indeed in any setting. 


Examples for further practice: 


1. For the given linear transformation $T$ from $\mathbb{R}^n$ to $\mathbb{R}^m$, find the matrix
of the transformation $T$, and hence compute $T(z)$, where $z$ is the given
vector.


(a) $T : \mathbb{R}^2 \to \mathbb{R}^3$ with the property that $T(e_1) = [1 \;\; -3 \;\; 4]^T$ and
$T(e_2) = [-2 \;\; 1 \;\; 0]^T$; $z = [3 \;\; -2]^T$.

(b) T : R® — R? with the property that T(e,) = [—2 — 1]", 
T(e2) = [5 aE and T(e3) = [3 4]!;z=[6 — 1 3]. 

(c) $T : \mathbb{R}^2 \to \mathbb{R}^2$ with the property that $T(e_1) = [7 \;\; 5]^T$ and
$T(e_2) = [-11 \;\; 3]^T$; $z = [3 \;\; -2]^T$.


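2. Let $T : P_2 \to \mathbb{R}^3$ be a linear mapping such that

\[ T(t^2) = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}, \quad
   T(t) = \begin{bmatrix} 0 \\ -2 \\ 1 \end{bmatrix}, \quad\text{and}\quad
   T(1) = \begin{bmatrix} -3 \\ 4 \\ 0 \end{bmatrix} \]

Determine $T(3t^2 - 4t + 7)$. (Recall that the standard basis of $P_2$ is
$\{1, t, t^2\}$.)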


3. For each given linear transformation $T$ below, find the matrix $A$ of the
transformation.

(a) $T(x, y) = (2x + y, -3x + 2y)$
(b) $T(x, y, z) = (x + y - 2z, -x + 2y + 3z)$
(c) $T(x, y) = (-x + 4y, x - 2y, 3x + 7y)$


4. Let $D$ denote the differential operator and $D^2$ the second derivative. Use
this notation to recast the following differential equations as equations
involving linear transformations, as shown in (D.4).


(a) $y'' - 6y' + 5y = 0$
(b) y’+4y =0 
(c) y'+5y = 10 


5. Again, let $D$ denote the differential operator. Let $L(y) = D^2(y) +
5D(y) + 4y$. Show that $L$ is a linear operator. In addition, find all
polynomial solutions to the equation $L(y) = 2t + 3$.


6. For each linear transformation $T$ given below, determine whether or not
the transformation is invertible and, if so, find a formula for its inverse.

(a) $T(x, y) = (2x + y, -3x + 2y)$

(b) $T(x, y) = (2x + y, -4x - 2y)$

(c) $T : \mathbb{R}^2 \to \mathbb{R}^2$ with the property that $T(e_1) = [7 \;\; 5]^T$ and
$T(e_2) = [-11 \;\; 3]^T$

(d) $T : \mathbb{R}^2 \to \mathbb{R}^2$ with the property that $T(e_1) = [7 \;\; -5]^T$ and
$T(e_2) = [-14 \;\; 10]^T$

(e) T is the mapping that takes each point (x, y) in the plane and reflects 
the point in the line y = x. 


(f) $T$ is the mapping that rotates each point $(x, y)$ in the plane by 90°
counterclockwise about the origin.

Solutions to selected exercises


Section 1.2 


1. The unique solution to the system is (—1, 1). 
3. The system has no solution. 
5. The system is consistent with unique solution (4, —2, 3). 


7. The system is consistent with infinitely many solutions given parametrically
by $(-3 - 2t, -2 - t, t)$, $t \in \mathbb{R}$.


9. The system is consistent with infinitely many solutions given parametrically
by $(-1 + 2t - 4s, t, 2 - 3s, s, -5)$, $t, s \in \mathbb{R}$.


11. No solution exists. 


13. There are infinitely many solutions given parametrically by
$(1 - 19t, s, 1 + 4t, t)$, $s, t \in \mathbb{R}$.

15. The system is consistent if $h = -21$ and inconsistent otherwise.

17. The system is consistent for all values of $h$; if $h \ne 0$, the solution is unique.
19. The system is consistent with unique solution (53/3, —8/3, —46/3). 

21. The system is consistent with infinitely many solutions given parametrically
by $(5/3 - (1/6)t, -13/3 + (5/6)t, t)$, $t \in \mathbb{R}$.

23. The system is consistent with infinitely many solutions given parametrically
by $(19/2 - 9t, -5/2 + (17/4)t, 2 - (3/2)t, t)$, $t \in \mathbb{R}$.

25. No. 

27. Yes. (2,1, 2). 

29. Yes. Consider the system $x_1 + x_2 + x_3 = 0$, $x_1 + x_2 + x_3 = 1$.

31. The number of pivot columns must equal the number of variables, so that
no free variables are present.

33. $4 = a_2(1)^2 + a_1(1) + a_0$, $7 = a_2(2)^2 + a_1(2) + a_0$, $6 = a_2(3)^2 + a_1(3) + a_0$,
so $a_0 = -3$, $a_1 = 9$, and $a_2 = -2$.

35. $I_1 = 10/41$, $I_2 = 80/41$, and $I_3 = 70/41$.


Section 1.3 


1. The product is not defined. 
3. $Ax = [19 \;\; 5 \;\; -13]^T$.


5. To get each entry in Ax, we take the dot product of the corresponding row 
in A with the column vector x. 


7 [4] [1/20 1/80 ]f], [2250 
“Tx | | 1/40 —1/40 }} xp 3750 |" 
9. Yes, b is a linear combination of the vectors aj, a2, a3; infinitely many weights 


work. For example, x; = 3, x2 = 1, x3 = 0. 


11. The system has infinitely many solutions, so $b$ is a linear combination of the
columns of $A$, and can be written as such a linear combination with infinitely
many different possible weights $(x_1, x_2, x_3)$. Each triple of weights is of the form
$(-3 - t, 5 + t, t)$.


13. The system has no solution, so b is not a linear combination of the columns 
of A. 


—1l1 

17. The system has infinitely many solutions of the form $(-t, t, t)$.
19. The system has infinitely many solutions of the form (—t/3, t). 
21. The system has the unique solution x; = x2 = x3 = 0. 

23. All vectors $b = [b_1 \;\; b_2]^T$ whose entries satisfy $b_1 = b_2/2$.

25. (a) F; (b) T; (c) T; (d) B (e) F. 


27. $x^{(1)} = [94.40 \;\; 70.40 \;\; 75.20]^T$, $x^{(2)} = [89.52 \;\; 79.50 \;\; 70.98]^T$,
$x^{(3)} = [85.27 \;\; 87.47 \;\; 67.26]^T$.


Section 1.4 


1. Infinitely many solutions, each of the form $(2t/11, 8t/11, t)$. Thus the
solution set is the span of the vector $[2/11 \;\; 8/11 \;\; 1]^T$.

3. The span of the vector $[8/5 \;\; 1]^T$.

5. $Ax = 0$ has only the trivial solution.


7. Because A has more columns than rows, A cannot have a pivot in every 
column. Therefore, free variables must be present when [A | 0] is row-reduced 
and nontrivial solutions exist. 


9. b is not in the span of the given vectors. 


11. Yes, using the weights $x_1 = -2$, $x_2 = 1$, $x_3 = 4$.

13. $W$ is a plane through the origin in $\mathbb{R}^3$ that contains the given vectors $v_1$
and $v_2$.

15. If $(x_1, x_2)$ satisfies $2x_1 - 3x_2 = 0$, then $x_1 = 3x_2/2$, so that the vector $x =
[x_1 \;\; x_2]^T$ is a scalar multiple of the vector $[3 \;\; 2]^T$. Hence, each point on the line
lies in $\mathrm{Span}\{[3 \;\; 2]^T\}$.

17. (a) T; (b) F; (c) F; (d) F; (e) F. 


Section 1.5 


1. $Ax = b$ is consistent for every $b \in \mathbb{R}^2$ since $A$ has a pivot in both rows.
3. $Ax = b$ is consistent for every $b \in \mathbb{R}^2$ since $A$ has a pivot in both rows.

5. $Ax = b$ is consistent for every $b \in \mathbb{R}^3$ since $A$ has a pivot in all three rows.

7. $Ax = b$ is not consistent for every $b \in \mathbb{R}^4$ since $A$ does not have a pivot in
row 4.


9. No. Because A has more rows than columns, it is impossible for A to have a 
pivot in every row. 


11. b is a linear combination of the columns of A with weights 1/5, —6/5. 


13. b is a linear combination of the columns of A; infinitely many different 
weights are possible: one triple of such weights for the respective columns is 
(6, —2, 0). 


15. $b$ is a linear combination of the columns of $A$ with weights $x_1 = -35/11$,
$x_2 = 1/11$, $x_3 = 7/11$.


17.x=x3/111]!. 

19. x = x)[8/5 1)". 

21.x = 23[-111]!. 

23. x =Xp+xn=[-1110]' +24[5 —3 —11]"). 
25.X =Xp +x, =[2 —3/23/2]"]+[000]". 

27. Ax = b is always consistent. 


29. Impossible. A can’t have a pivot in all three rows. 


31. $Ax = b$ will always be consistent. Since the described system has one free
variable present, there is one non-pivot column in $A$. Since $A$ has four columns,
$A$ must have three pivot columns and thus three pivot rows. Because of the
free variable, every equation $Ax = b$ will have infinitely many solutions.


Ba. (a) Feb) Bic) "Ted Fs ie) Taf) FB 
35. y= Yn + Vp = Ce? — 6/5. 


Section 1.6 


1. $S$ is linearly dependent.

3. $S$ is linearly independent.

5. $S$ is linearly dependent.

7. $S$ is linearly dependent.

9. (1) no; (2) yes; (3) yes; (4) yes; (5) no; (6) yes; (7) no; (8) yes.

11. Not necessarily either.


13. S may or may not span R?*. S$ cannot be linearly independent. 
15. Given any nonzero vector v, the zero vector may be written $\mathbf{0} = 0\mathbf{v}$.


19. $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is linearly independent for all real numbers k except k = 17/7. If
k = 17/7, $\mathbf{v}_3$ is in the span of $\{\mathbf{v}_1, \mathbf{v}_2\}$.


21. The columns of A are linearly dependent; the columns of A span $\mathbb{R}^4$. Both
hold because there are four pivot columns in this $4 \times 7$ matrix.


23. (a) F; (b) T; (c) F; (d) F 
25. cy = —2 and m = 4. 


Section 1.7 
-1 13 _ 10 —4 
1. (a)B+C= 1 11 |; (b) A+B is undefined; (c) —2A = 
ei: fai 8 
38 -—18 _34 —29 —28 80 —52 
(d) —3B+4C= |} —10 —33 :(¢ AB=[ 28 53 | BA —5 45 —40 
17 —10 —7 5 
i — = 
(g) AA is undefined; (h) A(B-+€) = | 10 ce | ( CA= —3 
10 . ‘i 
—3 9 
(Gj) C(A +B) is undefined; (k) A'+B= | -3 16]; (1) (B+C)! 
—1-6 
0 6-52 
=I 1-1], to _ | —38-6]. T = : 
Ee ll = a) oe = 35 Pal i) BO = ee : : 
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=28 =5=7 

—34 28 
(0) (AB)? = Be ma (BA)'=| 80 45 5 
—52 40 2 


3. Two square matrices of the same size can always be multiplied, and in either 
order. Non-square matrices can only be multiplied in both orders (AB and BA) 
when one is m x n and the other is n x m. Note that when A and B are not 
square, AB never equals BA. 


1-2 —3 —4 
a5 5 | a4 


7.B= ee § |: Note that BA = AB, 


9.B= k | |: Note that BA = AB, 


11. (a) No; (b) No; (c) 1; (d) No familiar one; (e) No such matrix exists. 
13. See 1.) above. 


Section 1.8 


1—L 
ql 2 , 
1A Ee | 


3. $A^{-1}$ does not exist.
5. $A^{-1}$ does not exist.

7. $A\mathbf{x} = \mathbf{b}_1$ and $A\mathbf{x} = \mathbf{b}_2$ each have infinitely many solutions, while $A\mathbf{x} = \mathbf{b}_3$
has no solution. We see that A is not invertible.


9. Multiplying A by E on the left switches rows 2 and 3 in A. 
11. Multiplying A by E on the left multiplies row 2 by c.

13. Multiplying A by E on the left replaces row 3 with row 3 plus a
times row 1.


15. AT! = Al, 

17. $(AB)^{-1} = B^{-1}A^{-1}$.

19. Suppose that both B and C are inverses of A. Then AB = I = AC. Since A is
invertible, we can multiply on the left by $A^{-1}$, from which it follows that B = C.


1 1 
21. Yes: A= E ah 


234° = PDP". 
25. Every point is rotated 60° counterclockwise. 
27. Every point is rotated 90° counterclockwise. 
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_ ff 2/5 -3/10 
29.C= Be | 


31. Apply the inverse of the Markov matrix to the current population. 
33. (a) F; (b) Ts (c) F; (d) F; (e) Ts (f) B (g) Ts (h) F. 


Section 1.9 


1. $\det(A) = 2 \neq 0$, so A is invertible.

3. $\det(A) = -28 \neq 0$, so A is invertible.

5. $\det(A) = 252 \neq 0$, so A is invertible.

7. $\det(I_n) = 1$; clearly $I_n$ is invertible.

9. The matrix is invertible for all real numbers z except z = 1, 3. 
11. $\det(AB) = \det(A) \cdot \det(B)$.

13. Since $AA^{-1} = I$, we have $\det(AA^{-1}) = \det(I)$. Now use the property of
determinants from Exercise 11 and solve for det(A).


15. $\det(A) = 0$ since the columns are a linearly dependent set; equivalently,
$\det(A^T) = 0$ since the rows of A are a linearly dependent set.

17. If $A^2$ is not invertible, then A is not invertible, since $\det(A^2) = \det(AA) = \det(A)\det(A)$,
so $\det(A) = 0$ if and only if $\det(A^2) = 0$.


d—b 
-l1 1 


Section 1.10 


1.4 = 5,3 with corresponding eigenvectors [1 ott 31, 

3. A does not have any real eigenvalues or eigenvectors. Its eigenvalues are
$\lambda = -1 \pm 2i$.

5. $\lambda = 2$ with corresponding eigenvector $[1\ 0\ 0]^T$.

7. $\lambda = 2$ with corresponding linearly independent eigenvectors $[1\ 0\ 0]^T$,
$[0\ 1\ 0]^T$; $\lambda = 0$ with corresponding eigenvector $[0\ 0\ 1]^T$.

9. Ax =[5 20]'. 
11. (a) $\lambda = -3, -3, 0$ with corresponding eigenvectors $[-1\ 1\ 0]^T$, $[-1\ 0\ 1]^T$,
$[1\ 1\ 1]^T$; (b) Yes.
13. (a) A = 5,2,2 with corresponding eigenvectors [1 — 1 if ett Tt oF 
[—1 0 1]'; (b) The columns of P are linearly independent; use the Invertible 
Matrix Theorem; (c) AP = PD; (d) $A^{10} = PD^{10}P^{-1}$.
15. Hint: $\det(B - \lambda I) = \det(PAP^{-1} - \lambda I) = \det(PAP^{-1} - \lambda PIP^{-1}) = \det(P(A - \lambda I)P^{-1}) = \det(P)\det(A - \lambda I)\det(P^{-1})$.
17. $D(e^{rt}) = r \cdot e^{rt}$, so taking the derivative only stretches $e^{rt}$ by a factor of r.
Thus, $e^{rt}$ is like an eigenvector with eigenvalue r.


19. Yes; $\mathbf{v} \approx [192.41\ \ 139.43\ \ 77.71]^T$ (in millions).
21. (a) F; (b) T; (c) F; (d) T. 


Section 1.11 


1. H is not a subspace. Consider multiplying [1 1] by a negative scalar.

3. H is a subspace.

5. H is a subspace.
7. H is not a subspace, since the zero vector does not belong to H.
9. H is not a subspace; the zero matrix is not invertible.

11. H is a subspace.

13. H is a subspace.

15. H is a subspace.

17. For $\lambda = 1$, the corresponding eigenspace is the set of all scalar multiples of
the eigenvector $[1\ 1]^T$. For $\lambda = 3$, the corresponding eigenspace is the set of all
scalar multiples of the eigenvector $[1\ -1]^T$.

19. Because the span of a set is the set of all linear combinations of a given 
collection of vectors, we can always make the zero combination to get the zero 
vector. In addition, because any linear combination is allowed, the span of a set 
of vectors must be closed under scalar multiplication and closed under addition. 
21. H is not a subspace of R> because no values of a and b can be chosen to 
form the zero vector in H. 

23. Col(A) is the set of all linear combinations of the columns of A, which is 
equivalently the span of the columns of A. By exercise 19, it follows that Col(A) 
is a subspace. 

25. $\mathbf{v} = [-2\ 1\ 1]^T$ is not in Col(A); $\mathbf{u} = [-1\ 4\ -4]^T$ is in Col(A); Col(A) is
the span of $\{[1\ 3\ -4]^T, [-2\ 1\ 0]^T\}$.

27. Col(A), because it is simply the span of the columns of the given matrix. 
29. Verify by direct substitution that y = Ce*’ + 1 is a solution to the equation. 
This set of all such solutions is not a subspace because the zero function is not a 
solution to the DE. 


Section 1.12 
1. A basis for H is $\{[2\ 0\ -1]^T\}$ and therefore H is one-dimensional.
3. A basis for H is $\{[2\ 1\ -3\ 1]^T, [3\ -4\ 2\ -1]^T\}$, so H is two-dimensional.
5. A basis for H is $\{[1/2\ \ 1]^T\}$; H is one-dimensional.


: : 10 00 00 : . : 
7. A basis for H is iE | ’ I 4 ‘ F || sos three dimensional. 


9. Yes, since S is a linearly independent spanning set in R?. 
11. No, since S is not linearly independent. 
13. No, a set with fewer than 4 vectors cannot span $\mathbb{R}^4$.
15. The vector space P of all polynomial functions is an infinite-dimensional
vector space because its basis has to include every power function:
$1, t, t^2, t^3, \ldots, t^{10}, \ldots, t^{100000}, \ldots$. Therefore, the basis cannot have a finite
number of elements.
17. dim(Nul(A)) + dim(Col(A)) = n since the dimension of the column space 
of A is the number of pivot columns of A and the dimension of the null space 
of A is the number of non-pivot columns of A. 


Section 2.2 


LiaeZcjrer 

3. A(t) = 100 is an equilibrium solution because it is a constant function 
that makes the DE true; this solution is a stable equilibrium, as seen from the 
direction field. 


5. The direction field should show an unstable equilibrium at P = 0 and a
stable equilibrium at P = 25, with all solutions with initial values greater than 0
tending toward P = 25 as $t \to \infty$.


7. (a) i; (b) iii; (c) ii; (d) iv. 
9. $y = t^2/2 + \sin t + C$.
ll. y= t4/12+0+Ct+Q. 
13. y=sint—tcost+C. 
15.y=—heP +0. 
17. $y = t^2/2 + \sin t - \pi^2/8$.
I. y=A/lz¢e-—Br+ 2. 
21. y =sint —tcost+2. 


25. $y = 3/2$ is a stable equilibrium.
27. $y = 1$ and $y = -1$ are stable equilibria; $y = 0$ is unstable.
29. $y = 1$ and $y = 3$ are unstable equilibria.
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Section 2.3 


1. linear. 
3. nonlinear. 


5. nonlinear. 


7.97 =Ce- 
9.y= Cet, 
ll. y= Cesc t. 


13. y= C(100— t). 
15.y=—1/2+t+Ce, 

l7.y= i ea ee ta, 
19. y= (#7? ++ C)/(t? +1). 


2l.y=2+e. 
23. y= 10—5e-#/2, 
3.y=1. 


27. y = 3 —0.03t — 0.002(100 — t)’. 

29, y= =1jIs7= 1/26", 

31. y=2t-7e! —2t teh + eh +: (4-e)t’. 

33. D(f +g) = D(f) + D(g) and D(cf) = cD(f). 


Section 2.4 


1. 37.73 h. 
3. 129.66 min. 


5. (a) $P' = 0.002P + 5$, $P(0) = 100$; (b) $P(t) = 2600e^{0.002t} - 2500$; (c) about
102 thousand walleye more.

7. (a) $A' = 1.5 - A/60$, $A(0) = 45$; (b) $A(t) \to 90$ as $t \to \infty$; (d) 65.92 min.
9. 643.76 days. 

11. Use an integrating factor to show that T = (Tp — Tin)e7 aT. 

13. 13.08 h. 

15. 24.76 h. 
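
As a quick sanity check on exercise 5 above, the following sketch tests whether the closed form in 5(b), read here as $P(t) = 2600e^{0.002t} - 2500$ (the exponent is an assumption recovered from the differential equation in 5(a)), satisfies both the initial condition and the DE. The helper names `P`, `rhs`, and `derivative` are illustrative only, and the sample times are arbitrary.

```python
import math


def P(t):
    """Candidate solution for exercise 5(b): P(t) = 2600 e^{0.002 t} - 2500."""
    return 2600.0 * math.exp(0.002 * t) - 2500.0


def rhs(p):
    """Right-hand side of the DE in 5(a): P' = 0.002 P + 5."""
    return 0.002 * p + 5.0


def derivative(f, t, h=1e-5):
    """Central-difference approximation of f'(t)."""
    return (f(t + h) - f(t - h)) / (2.0 * h)


# Check the initial condition P(0) = 100.
print("P(0) =", P(0.0))

# Compare the numerical derivative with the DE's right-hand side.
for t in (0.0, 10.0, 50.0):
    print(t, derivative(P, t), rhs(P(t)))
```

The two printed columns should agree to within the accuracy of the finite difference, which is what direct substitution into the DE predicts.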


Section 2.5 
1. linear, separable. 
3. nonlinear, separable. 


5. nonlinear, separable. 
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7. linear, separable. 
9. linear, separable. 
11. linear, separable, exact. 
13. exact. 
15. ¢=Ge™, 
17. y=—1/(10t+ C). 
19. y= (1+ Ce?/*)/(1 — Ce?/*), 
21. $y = 1 + Ct$.
23. $y = (-6 + Ct)/(3t - 1)$.
25. $y = C/(2 + t^2)$.
7.¢=-t2 Or +c). 
29. y= 3), 
31. y = —4/(40t — 41). 
33. y= (1— eft e/*}), 
35. $y = 1 + 2t$.
37. $y = (-6 + 16t)/(3t - 1)$.
39. $y = 3/(2 + t^2)$.
A.y==t+0r +1). 
43. Consider $y = 0$ and $y = t^2/4$. This result does not violate the noted theorem
since $f(t, y) = y^{1/2}$ does not have a continuous partial derivative with respect
to y in a rectangle containing (0, 0).


Section 2.6 


1. (a) $y(3) \approx y_{10} = 4.08956$; (b) $y(t) = \sqrt{8 + t^2}$.
3. (a) With h = 0.1, $y(1.5) \approx y_{15} = 1.56309$; with h = 0.05, $y(1.5) \approx y_{30} = 1.57217$.

5. (a) $y(1) \approx y_{10} = -0.76341$; (b) $y(t) = -2e^{-t}$.
7. (a) $y(1) \approx y_{10} = 5.18748$; (b) $y = 2e^t$.
9. (a) $y(1) \approx y_{10} = 3.06501$; (b) $y(t) = \sqrt[3]{24t + 1}$.

11. $y(1) \approx y_{10} = 7.56597$.

13. $y(1) \approx y_{10} = 0.77258$.
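
The Euler approximations in this section can be reproduced with a few lines of code. The sketch below is a minimal illustration, not the text's own program. The exercise statements are not reproduced in this appendix, so it assumes that exercise 9 is the IVP $y' = 8/y^2$, $y(0) = 1$ with step size h = 0.1, which is consistent with the exact solution $y = \sqrt[3]{24t + 1}$. The function name `euler` is hypothetical.

```python
def euler(f, t0, y0, h, n):
    """Take n forward-Euler steps of size h for y' = f(t, y), y(t0) = y0."""
    t, y = t0, y0
    for _ in range(n):
        y += h * f(t, y)  # follow the tangent line over one step
        t += h
    return y


# Assumed IVP for exercise 9: y' = 8 / y^2, y(0) = 1, ten steps of size 0.1.
approx = euler(lambda t, y: 8.0 / y**2, 0.0, 1.0, 0.1, 10)
exact = 25.0 ** (1.0 / 3.0)  # y(1) from y = (24 t + 1)^(1/3)

print("Euler y_10:", round(approx, 5))
print("exact y(1):", round(exact, 5))
```

Under these assumptions the Euler value should land close to the 3.06501 quoted in 9(a); the gap from the exact value reflects the first-order accuracy of the method at this step size.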


Section 2.7 


1. (a) P increases for $0 < P < A$; (c) P increases most rapidly at the instant
$P = A/2$; (d) $M = (A - P_0)/P_0$.
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3. (a) P=0 and P= 4; (b) P=0 is unstable, P = 4 is stable; (c) P = 2; 
(d) t = 47.58. 
5. P(t) = (6e** + 4)/(e?! + 4), which makes sense since this is an increasing 


function that tends to 6 as t > 00; the equilibrium solutions of the DE are P = 1 
(unstable) and P = 6 (stable). 


7. $t \approx 9030.58$.
9. $t \approx 2849$ s.

11. (a) Because $f(t, h) = -\sqrt{h}$ does not have a continuous partial derivative
with respect to h on a rectangle containing the point (1, 0); (b) because we have 
no idea what time the tank actually emptied; (d) the solution in (c) shows that 
for any time c < 1, there is a valid solution function which represents the tank 
emptying at time c. This demonstrates both the nonuniqueness of the solution 
and the fact that the problem is ill-posed since we do not know the time the tank 
actually emptied. 


Section 3.2 


1.4 = —1,5 with corresponding eigenvectors [—2 if! i. 

3.4 = —1,9 with corresponding eigenvectors [—3 iJ", f1 3)". 

5.4 = 1,4, 0 with corresponding eigenvectors [—2 1 i i HT i. 

7. $\lambda = 2, 2, 2$ with corresponding eigenvector $[1\ 0\ 0]^T$.

9. (a) A= |= sf (b) x = 0 is the only constant solution; (c) 4 = 1, 6 with 

corresponding eigenvectors v = [1 1", [2 7]'; (d) x(t) = e*[1 1]" and x,(t) = 
e127]! (e) x= ce[1 I) + oc% [2 7]'; (x= —He'fl Gue = 2e'[2 7]; this 
vector function has its length grow without bound as t > oo. 
ll. (a) A= ie a) (b) x = 0 is the only constant solution; (c) A = 
—2,—2 with corresponding eigenvector v = [1 O's (d) x(t) = e"[1 OF; 
(e) x = qe *"[1 0]!; (£) There is no value of c, for which the solution in (e) 
satisfies this IVP. This tells us we must not have found the correct general 
solution in (e). 


3-1 
to the given system, so there are infinitely many such solutions; (c) A =0, —4 with 
corresponding eigenvectors [1 3], [—1 1]'; (e) every solution is a straight line 
solution of form c;[1 3]'+ @e~*![—1 1]'; (f) x(t) = 3[1 3° — ze-4[-1 is 
which tends to the vector [1 3]! as t > oo. 


13. (a) A= = I | (b) Any vector of form x = x2[1 3]! isa constant solution 
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8 —1—-11 
15. (a) A= | 18 —3 —19 |; (b) x = 0 is the only constant solution; (c) A = 
2-1 —-5 


—4, —2,6 with corresponding eigenvectors v = [1 1 it —1ajeio: 
(d) x,(t) = e~*[1 1 1J!, x(t) = e7[1 —1 1J', and x3(t) = e[2 1 oO]; 
(e) x = c,X) + x2 + 3x3; (f) x = le~**[1 1 1]!, whichisa straight-line solution 
that approaches zero along the line through (1, 1, 1). 


0 1 
ae <= 
17. x’ = Ax where A= E ei 


19. x’ = Ax+ b(t) where A= E | and b(t) =[0 e']!. 
010 
21. x’ = Ax where A= 001 
—560 
23. xy = — Gia + gap + 35, xh = Goya — spp + 27. 


, [0.04 0.08 _f 25 
cia 0.04 —0.08 |} * =| 150 |: 
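27. Use direct substitution with $\mathbf{x}'(t) = \lambda e^{\lambda t}\mathbf{v}$ and $A\mathbf{x} = A(e^{\lambda t}\mathbf{v}) = e^{\lambda t}A\mathbf{v}$, along
with the fact that $A\mathbf{v} = \lambda\mathbf{v}$.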
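
Written out line by line, the substitution suggested in exercise 27 might look as follows. This is a sketch that takes $\mathbf{v}$ to be an eigenvector of $A$ with eigenvalue $\lambda$, as the hint indicates.

```latex
\begin{align*}
\mathbf{x}(t)  &= e^{\lambda t}\mathbf{v}, \\
\mathbf{x}'(t) &= \lambda e^{\lambda t}\mathbf{v}, \\
A\mathbf{x}(t) &= A\left(e^{\lambda t}\mathbf{v}\right)
                = e^{\lambda t}A\mathbf{v}
                = e^{\lambda t}\lambda\mathbf{v}
                = \mathbf{x}'(t).
\end{align*}
```

The scalar $e^{\lambda t}$ passes through the matrix product, so the eigenvector equation alone shows that $\mathbf{x}' = A\mathbf{x}$.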


Section 3.3 


1. 4; 7. 

3. 3; the given linear third-order homogeneous equation should also have a 
three-dimensional solution space. 

5. $\mathbf{x}_1(t)$ and $\mathbf{x}_2(t)$ are linearly independent.

7. $\mathbf{x}_1(t)$, $\mathbf{x}_2(t)$, and $\mathbf{x}_3(t)$ are linearly independent.

9. For two vectors, it’s equivalent to ask if they are scalar multiples of each 
other. 
11. (a) A has the repeated eigenvalue $\lambda = 3$ with a single corresponding linearly
independent eigenvector $\mathbf{v} = [1\ 0]^T$; (c) $\mathbf{x}(t) = c_1e^{3t}[1\ 0]^T + c_2(te^{3t}[1\ 0]^T + e^{3t}[0\ 1]^T)$;
(d) $\mathbf{x}(t) = 3e^{3t}[1\ 0]^T + 2(te^{3t}[1\ 0]^T + e^{3t}[0\ 1]^T)$.
13. (a) A has complex eigenvalues $\lambda = \pm i$ with corresponding complex
eigenvectors; (c) $\mathbf{x}(t) = c_1[\cos t\ \ \sin t]^T + c_2[-\sin t\ \ \cos t]^T$; (d) $\mathbf{x}(t) = 3[\cos t\ \ \sin t]^T + 2[-\sin t\ \ \cos t]^T$.
15. $y = c_1\cos t + c_2\sin t$.


I7.y=aq+oe+oe'. 
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Section 3.4 

1. (a) x(t) =cye'{1 1]’+ce7-*[1/3 1]"]"; (b) The origin is a saddle point and 
therefore unstable. 

3. (a) x(t) = ce~*"[—1 1]? + ce~"[1 1]"]"; (b) The origin is a stable attracting 
node. 

5. (a) x(t) =a [1 1’ + @e*[-2 1]"]'; (b) Every point of form k[1 i? 
is an equilibrium solution of the system. Each is stable. (c) Every nonconstant 
solution is a straight line because only one of the terms in x(t) has an exponential 
function present. That term results in a straight-line solution; the added constant 
only shifts the line. 

7.x; (t) = e*[—1 2] and x2(t) = e~*"[1 2]! are straight-line solutions; the 
origin is an unstable saddle point. 

9. x,(t) = e® [1 1]? and x)(t) = e![—1 1]! are straight line solutions; the 
origin is an unstable repelling node. 


1/2 7/4 
rae UA 
13. x(t) =e'/2[1 17° +3e-'/2[1/3 1]. 
(3.x) S—3e 4 /2[=1 1)’ =e 1 1)", 
17. x(t) =3[1 1? +e*[-2 1". 
19. y=ae'+qe. 


21.7 =qe*+oe*. 
Section 3.5


1. The origin is a stable attracting node.

3. The origin is an unstable saddle point. 

5. The origin is a stable center. 

7. The origin is an unstable repelling node. 

9. The origin is a stable center. 
11. The origin is an unstable repelling node.

13. The origin is a stable attracting node.

15. (a) $\mathbf{x}(t) = c_1[\cos 2t\ \ \sin 2t]^T + c_2[-\sin 2t\ \ \cos 2t]^T$; (b) the origin is a stable
center; (c) none.

17. (a) $\mathbf{x}(t) = c_1[e^{-2t}\ \ 0]^T + c_2[te^{-2t}\ \ e^{-2t}]^T$; (b) the origin is a stable attracting
node; (c) one, along the line through (0,0) in the direction of $[1\ 0]^T$.
19. (a) x(t) = cpe*[2 1]7 + me?" (t[2_ 1]7 + [—-1 J"; (b) the origin is an
unstable repelling node; (c) one, along the line through (0,0) in the direction of
$[2\ 1]^T$.
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21. x(t) = —e*[sin3t —cos3t]! —3e?'[cos3t sin3t]!. 

23. x(t) = —7/3[sin3t 2 cos3t+ 4sin3t]! —2[cos3t — 2 sin3¢+ 2cos3t]!. 
25. (a) x(t) = qe“[-1 0 1]'+ oe*f1 1 OJ’ + c3ef1 —1 1)"; (b) the 
origin is an unstable repelling node; (c) there are three straight line solutions, as 
demonstrated in (a). 


27. The characteristic polynomial for a 3 x 3 matrix is a cubic polynomial, and 
thus must have at least one real zero. This forces the matrix A to have at least 
one real eigenvalue, and with it, at least one corresponding real eigenvector, thus 
generating at least one straight-line solution. A 4 x 4 matrix may possibly have 
all complex eigenvalues, and thus the system may have no straight line solution. 
In general, any time n is odd, an n x n homogeneous system is guaranteed at 
least one straight-line solution. 


29. y= ce‘ sin2t+ ce‘ cos2t. 


3ly=aqe'+oe%, 


Section 3.6 

Lx = qe (7 — 1) 1" + qe 2-VB—/7 -— 1) T+ 
(=a: 3), 

3.x=cqe*[1 1]'+cqe[—1 1]'+sint[—2/5 1/10]! +cost[—3/10 1/5]'. 

5. (a) Because the forcing function b(t) is constant, vxp = [4/3 — 7/3]; 
(b) x, = qe-‘[—1/2 1]'+ @e*[1/2 11°; (d) Vxp = [4/3 — 7/3]' is constant 
and thus an equilibrium solution. Since the eigenvalues have opposing signs, 
this equilibrium point is an unstable saddle. 


7. (a) Xp = [—2/3 —1/5e-7*_ — 1/3 — 2/5e~**]7; (b) x, = ce**[1/2 17+ 
ee (S121. 

9x=ce[l 1’+oe%[-1 194+ [1/9 —1/9]7. 1l.x=cefl oT + 
ee [1/2 1] +[=13/2e% 4/3e-*]'. 

13. $\mathbf{x} = c_1[\cos t\ \ \sin t]^T + c_2[-\sin t\ \ \cos t]^T + [2\ 3]^T$.

15. $\mathbf{x} = c_1[\cos t\ \ \sin t]^T + c_2[-\sin t\ \ \cos t]^T + [2 + 3e^t/2\ \ \ 3 - e^t/2]^T$.
17.x=qe [1 1'4+cqe[1/3 1]' + [44 3/10sin3t — 1/5cos3t 8 —3/ 
10cos3t]'. 

19.x = —5/2e~"[1 O]' +11/18e"[1/2 1)’ +[—13/2e~*! 4/3e~7*]". 
21.x=-—lfcost sint]’ —5[—sint cost]' +[2+3e'/2 3—e'/2]'. 23. X»= 
[asin3t+ bcos3t+csin2t+dcos2t esin3t+f cos3t+gsin2t+hcos2t]'. 


Section 3.7 
1. $\mathbf{x}_p = [-17/5\ \ 13/5]^T$.
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3. (a) Xp = [Ae’ Be']'; (b) x, = qe[—1 1]? + ce"%[1 1)", which includes 
the natural guess for xp, so that guess for a particular solution will fail to work; 
(c) X» =[te’ — (t+ Let]. 


= tae _ 
5. Xp =[—e = Ae ray, 
7. Xp =[—4cos2t+ 2 sin2t — 2 cos2t— 2 sin 2t]". 


9. Xp =[—11/12e~' —3/4te* 9/4te~*]". 
ll.xp=[-t-1 —t—-2]'. 


Section 3.8 


a _ [4/100 4/50 _ 25 
1. The IVP is x’ = Ax where A = | 4/100 —4/50 and x(0) = . The 


solution to the IVP is x(t) = : dag araat | which is a straight-line 
solution that tends to the stable equilibrium (50, 25) as t + oo. 


3. The matrix A in the system x' = Ax + b stays the same as in #2, but the
system is now homogeneous of the form x' = Ax. As $t \to \infty$, $\mathbf{x}(t) \to \mathbf{0}$, which
is consistent with the fact that the amount of salt in each tank will go to zero as
time progresses.

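5. The IVP is $\mathbf{x}' = A\mathbf{x} + \mathbf{b}$ where
$A = \begin{bmatrix} -7/400 & 0 & 0 \\ 7/400 & -7/200 & 0 \\ 0 & 7/200 & -7/300 \end{bmatrix}$, $\mathbf{b} = \begin{bmatrix} 70 \\ 0 \\ 0 \end{bmatrix}$, and $\mathbf{x}(0) = \begin{bmatrix} 8000 \\ 10000 \\ 0 \end{bmatrix}$. The general solution to the system is

$\mathbf{x}(t) = c_1 e^{-7t/300}\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} + c_2 e^{-7t/200}\begin{bmatrix} 0 \\ -1/3 \\ 1 \end{bmatrix} + c_3 e^{-7t/400}\begin{bmatrix} 1/6 \\ 1/6 \\ 1 \end{bmatrix} + \begin{bmatrix} 4000 \\ 2000 \\ 3000 \end{bmatrix}$,

from which we can see that our intuition is confirmed: with just one inflow
putting brine at 10 g/liter into the system, eventually the concentration should
stabilize throughout at a concentration of 10 percent by volume. The constants
$c_1$, $c_2$, and $c_3$ can be determined by applying the initial conditions: $c_1 = -15000$,
$c_2 = -12000$, $c_3 = 24000$.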

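7. (a) $y'' + 4y = 0$, $y(0) = 0.4$ and $y'(0) = 0$; (b) $\mathbf{x}' = \begin{bmatrix} 0 & 1 \\ -4 & 0 \end{bmatrix}\mathbf{x}$, $\mathbf{x}(0) = [0.4\ \ 0]^T$;
(c) $\mathbf{x}(t) = [\tfrac{2}{5}\cos(2t)\ \ \ -\tfrac{4}{5}\sin(2t)]^T$, so $y = x_1 = \tfrac{2}{5}\cos(2t)$.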

92 (a) yy ap — tos, 90) [03 end pK Oh = a is 


x(0) = [0.3 OJ; (c) x(t) = e~!/?"[—0.0775 sin(1.936t) + 0.3cos(1.936t) — 
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0.620sin(1.936t)]', so y = x, = e~!/*'[—0.0775 sin(1.936t) + 0.3 cos(1.936t). 
Thus the solution function oscillates and decays to zero as $t \to \infty$.


11. (a) I” + RI! + 1001 = 0, 1(0) = 100, 1'(0) = 05 (b) x’ = le 


—100—R 
x(0) =[100 0]!; (c) (i) I = 100cos 10t, (ii) I = x; = e~**((400/3 sin6t + 
100cos6t), (iii) I = x) = =F H4(190 + 1000t), (iv) I = x; = 400/3e—>* — 
100/3e77"", 
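
The three-tank answer in exercise 5 of this section can be checked with exact rational arithmetic. The sketch below assumes the system reads $\mathbf{x}' = A\mathbf{x} + \mathbf{b}$ with the lower-triangular matrix, forcing vector, initial condition, eigenvectors, and constants entered in the code; these are a reading of the printed answer, and the variable names are illustrative only.

```python
from fractions import Fraction as F

# Assumed data for exercise 5 (a reading of the printed answer).
A = [[F(-7, 400), F(0), F(0)],
     [F(7, 400), F(-7, 200), F(0)],
     [F(0), F(7, 200), F(-7, 300)]]
b = [F(70), F(0), F(0)]
x_eq = [F(4000), F(2000), F(3000)]  # constant (equilibrium) solution


def matvec(M, v):
    """Multiply a 3x3 matrix by a length-3 vector."""
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]


# 1. The constant vector should satisfy A x_eq + b = 0.
residual = [r + bi for r, bi in zip(matvec(A, x_eq), b)]

# 2. Each (eigenvalue, eigenvector) pair should satisfy A v = lambda v.
pairs = [
    (F(-7, 300), [F(0), F(0), F(1)]),
    (F(-7, 200), [F(0), F(-1, 3), F(1)]),
    (F(-7, 400), [F(1, 6), F(1, 6), F(1)]),
]
eig_ok = all(matvec(A, v) == [lam * vi for vi in v] for lam, v in pairs)

# 3. The stated constants should reproduce the initial condition at t = 0.
c = [F(-15000), F(-12000), F(24000)]
x0 = [sum(c[k] * pairs[k][1][i] for k in range(3)) + x_eq[i] for i in range(3)]

print(residual, eig_ok, x0)
```

Using `Fraction` avoids floating-point noise, so each check is an exact equality rather than an approximate one.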
Section 4.2 


ly=ce*+oe—*. 
3.y=cje'+oe'. 
5.y=ct+et. 


= = 


7.y=ce a ae +oe 2 
9.y=2e'. 

Ll. y=2/3+1/3e7**. 

13. y= —6e—* + 4e7*, 


V5 


15. $y'' - 4y = 0$.
17. $y'' - 4y' = 0$.
19. $y'' = 0$.


21. (b) the roots of the characteristic equation are the complex numbers $r = 1 \pm 2i$;
(c) the two functions are linearly independent because neither is a scalar
multiple of the other; (d) $y = c_1e^t\cos 2t + c_2e^t\sin 2t$.


23. y=5e*—3eF, 
25. y = 325/2e—* — 125/2e—**. 


Section 4.3 
ly=ce*+ ote. 
3.y=qe/*4 wte*/?, 
5. y=, cos2t + ~cos2t. 
7.y=cqe*+ cyte. 

9. y= qe 4+ we, 

Ly =V3e—*/? sin. /3t/2 + e—*/? cosV/3t/2. 
.y =19/4e774 +. 9/4074 

15. y = 16/5e™' sin 5t — 3e>' cos5t. 


ee 
Oo 
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17.y=0. 

19. (a) y= ce?! + cyte**; (d) x, =y. 

21. The equation will have two real distinct roots when $a_1^2 - 4a_0 > 0$, one
real repeated root when $a_1^2 - 4a_0 = 0$, and two distinct complex roots when
$a_1^2 - 4a_0 < 0$.


23. $y = \tfrac{\sqrt{3}}{3}\sin\sqrt{3}t + 2\cos\sqrt{3}t$.
25. I = 100e~*! + 225te~*. 
27. I = 25/3sin3t + 100cos3t. 


Section 4.4 

l.y=ce* + me—7* +.5/4e"*. 

3.y=cje '}+oe'+11/2ter. 
5S.y=q t+ et+1/12t4 + 3/20. 

7.y = ce 7* 4+ ete *! 4+. 3/8 —1/2t+ 1/4¢?. 

9. y=, sin2t + © cos2t + 2e'(sint + 2cost). 

11. y= —5/7e* +.41/28e 7! + 5/4e7*, 

13. y= 1/4e—' — 13/4e' +11/2te’. 

15. y= —2—2¢+1/12t* + 3/227. 

17. y=37/8e 7! +51/4te—* + 3/8 —1/2t + 1/407. 


19. y= —7/2sin 2t — 4cos2t + 2e'(sint + 2cost). 


isin 
Sit 


21.y=cqsint+c@cost—cost-In= 
23.y=qe 7? 4 ote A+ vee 

25.y=cje'+ate' +te'(—1+Int). 

27. y = ce” + me ** = 1/10e-**(e** = 2e** + 2In(e* +: 1)e* = 2te* + e** = 
2e' + 2In(e’ + 1)). 

29. $y = \tfrac{1}{2}\sin 2t + \tfrac{582}{41}\cos 2t - \tfrac{500}{41}\cos\tfrac{21}{10}t$; $y_h$ and $y_p$ are each equi-oscillatory
functions whose frequencies are nearly equal. When added together,
they sometimes cancel each other out, leading to widely varying behavior in the
amplitude of y.


31. y = 106.28¢~9-000625t _ 5.79 e~9.999t _ 0.481 cos 20t + 0.096 sin 201. 


Section 4.5 


1. $y = \tfrac{1}{1000}t\sin 5t$; there is no maximum displacement of the mass as
oscillations are unbounded due to resonance.
3. $y = -\tfrac{1}{72}\sin 6t - \tfrac{1}{72}\cos 6t + \tfrac{1}{72}e^{6t}$; the displacement is unbounded,
but resonance is not present.

5. $y = -1.98\sin 7.07t + 2\sin 7t$; beats are present. The maximum displacement
is approximately 3.98.

7. Beats are present; $y = \cos 6.93t - \cos 7t$.
9. $I = 10t\sin 10t$; resonance is present.


11. I = —80/33cos100t + 80/33cos10t; neither beats nor resonance is 
present. 


13.c¥ 2.5. 


Section 4.6 

ly=qe'+ae—'+cse**. 

y= eae ee >/2t + oye, 

5.y=aqe '+ate*+ct*e*. 

7.y=cle’+©cos2t+csin2t. 

9. y= ce! + me + c3e~' + sint + c cost. 
lly=qe'+ote'+ot?et*++ate—. 
13. y=1/24+1/4e77 + 1/4e”*. 

15. y = —3e** + 8e7# — 5et. 


17. y=5sin2t. 
19. y= 1742. 
21. y=e' + te — tel. 
23. y"”—y' =0. 


25. y) + yy + oy!” + 9y” = 0. 

27. v4) + yl” + 105/4y” + 25y’ + 125/4y = 0. 
29.y=c,cost+qsint+cotcost+ ctsint +7/32— 1/8t? cost. 
3ly=qe+qe'+oe*+1. 

33.y= cot eye 3/28 + cye2t — 6/325 cost — 17/325sint. 
35.y=cje'+qte'+ct?e—' —1/4cost — 1/4sint. 

37. y= ce’ +c. cos2t + c3sin2t —1/10e~*. 

39. y=cje'+ qe + ce + qsint+ ccost +7/2. 
Al.y=qe'+ote'+ te! + qte* —44 t—1/4cost. 
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Section 5.2 


sr 


1, limp-sge re = lumps 55 ae = lim;-so So = 0, where the second 


equality holds by an application of L’Hopital’s Rule. 
3. Consider applying L’Hopital’s Rule n times to lim,_, 45 Sy. 


e 


5. F(s)=3 

7. F(s)= 2 _ me 
9. F(s)= 4-2 
11. F(s)= +. 
13. F(s) = &. 
15. F(s) = Gap 


Section 5.3 
1. F(s) =3/s—1/(s—1). 


3. $F(s) = 3/(s - 2) - 6/(s^2 + 4)$.
5. $F(s) = 4s/(s^2 + 25) + 6/(s + 2)$.
7. $F(s) = 2/(s + 1)$.
9. $F(s) = 24s(s^2 - 1)/(s^2 + 1)^4$.
11. $F(s) = 20/(s^2 + 25) - 6/(s + 2)$.
13. F(s) =2/(s+1)*. 

15. $F(s) = 2/((s + 1)^2 + 4)$.

17,( t) =cosh(2f) sin(3t) = $(e' +e‘) sin(3t) so F(s) = $(3/((s— 1)? +9))+ 
(3/((s+1)?+9)). 

19. F(s)= Bier) (7/((s+3)' +1)). 

21. F(s) = ((6s? — 3)/((s? + 1)9)) — (2s/(s? + 1)’). 

23. F(s) = 2(s+1)/((s+1)* +1). 

29. $\mathcal{L}[f^{(4)}(t)] = s^4\mathcal{L}[f(t)] - s^3f(0) - s^2f'(0) - sf''(0) - f'''(0)$.


Section 5.4 
1. f(t) = u(t —1)— u(t —2). 


542 Appendix E: Solutions to selected exercises 


3. f(t) =t- [u(t —1) — u(t —2)] + 27 - u(t — 2). 
5. $f(t) = \sin(t) \cdot [u(t) - u(t - 2\pi)]$.
7. f(t 


9. F(s = _ ee 
—8 
11. F(s 2a te - 


13. y+ 5y/ + 3y =i sin(2t)- u(t—4) + 46(t— 10), y(0) = 0.25, y/(0) =0. 


)= 
j= 
)=t-[u(t) — u(t —2)]+2-[u(t—2) — u(t -—4)] + (4-1): u(t —4). 
) 
)= 


Section 5.5 
lLy=4-e, 
3.y=4-—e77. 
5. y =(F(-1+20)e —3)e*. 
7.y=u(t—3)(—4— att Bed) — 40%, 
y= 2 sin(3t). 
ll.y= ;sin(3t) ~ § cos(3t) + 5. 
13.y= 2tsin(3t). 
15. y = 2e7** + 6te~**, 
Iny = se sin(2t) + e~'cos(2t) — tu(t — 4) (— 1+(4 sin2(t —4)+ 
cos2(t —4))e-¥?), 
19. y = 3/2e—* +. 1/2e*? + 1/12u(t — 3)(—4 4 3e7 (9) + e3lt-9), 
21. y = (1/5)e7* — 11/5e7**, 
23. y= 1/4(-1+2t)e’ —3/4e*. 
25. y(t) = 5/6tsin3t. 
27. (a) y(t) = 1/36 — 1/36cos(6t); (b) y(t) = —1/5148e~*/? sin(1/2,./143t) 
J/143 — 1/36e~*/? cos(1/2./143t) + 1/36; (c) y(t) = —1/36e—* — 1/6te~%! + 
1/36; (d) y(t) = —1/32e77! + 1/288e7!8* + 1/36. 
29. (a) y = 5/72sin6t — 5/12tcos6t; (b) y = 5/143/858e~*/? sin /143t/2 + 
5/6e—*/? cos/143t/2 — 5/6 cos6t; (c) y= 5/72e~! +. 5/12te—® — 5/72. cos6t; 
(d) y= 3/64e7*! — 1/192e—18! — 1/24 cos6t. 


Section 5.6 
1. f(t) = (2—6t)e**. 
3. f(t) = —1/4—t/24+ 1/47". 
5. f(t) = 6/25 cos2t + 9/50sin 2t + 2/25(5t —3)e~'. 
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7. f(t) = 1/2u(t — 1)(sin(t — 1) — 2 sinh(t — 1) + (t — 1) cosh(t — 1)). 

9. f(t) =5/9u(t — 2)(—9 + 5e4('—-™) — ef -™(—44 15(t —))). 

1l. y=5/8sin2t — 1/4tcos2t. 

13. y = 3/4sin2t — 1/2tcos2t + 1/2u(t — 6)sin2(t — 6) + u(t — 12) 
sin2(t — 12). 

15. y=5/8e‘ sin2t — 1/4te~' cos2t. 

17.y=1/2e~ sin2t+1/8e‘(—2t cos2t+sin2t+(t+m)u(t—z)(1/2sin2t+ 
(a — t)cos2t)). 

19. y(t) = 17/18e~2" + 4/3te’ + 5/9e' — 1/2 — 1/3u(t — 3)e-2-9) + 1/3u 
(f=3)e". 

21. y(t) = 1/2e7'sin2t + 1/16e~"(t + 2)(—2tcos2t + sin2t) + 1/2u 
(t —5)e—'+9 sin 2(t — 5). 


Section 6.2 

1. (0,0), (1/2, 1/4). 

3. $(k\pi/2, j\pi/2)$, where $k = \pm 1, \pm 5, \pm 9, \ldots$ and $j = \pm 1, \pm 3, \pm 5, \ldots$.

5. The system has no equilibrium solutions. 

7. (0,0), (1/2, 1/4), (-1/2, 1/4).

9. $(\pm k\pi, 0)$, where $k = 0, 1, 2, \ldots$. At even multiples of $\pi$, the system
demonstrates stable equilibria with stable centers nearby; at odd multiples of
$\pi$, the system shows unstable equilibria, which correspond to the pendulum
starting in a vertical position.


Section 6.3 
2x1 1 —2x, 1— 2x, 
5. J(x1, 2, %3) = 
2xi(1-44 <i | ca | oe 22x2(1-44 - | i | ie 2x1(1-44 x | x. | oe 
—2x e123 —2x e123 2x3 0781 3-3 
2 —6x) 4x3 


21 xj-1 
7. Fos) =| 3 | pore 


9. F(x), x2) © ee | [= _ al 
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% 


& ‘| EI near (1/2,1/4), see exercise 9; (c) the purely imaginary 


eigenvalues of the Jacobian matrix show that (0,0) is stable since nearby 
trajectories are approximately elliptical. 
13. (a) Equilibrium solutions: (km/2, j7/2), where k = +1,+5,+9,... and 


j = +1,4+3,45,...5 (b) near (2/2), (2/2), | | ~ Se 


7 
11. (a) Equilibrium solutions: (0,0), (1/2,1/4); (b) near (0,0), 13 | ~ 


(c) the repeated zero eigenvalue of the Jacobian matrix does not reveal useful 
information; a plot of the direction field nearby shows that $(\pi/2, \pi/2)$ appears
to be unstable. 


15. There are no equilibrium points for this system. 
17. (a) Equilibrium solutions: (0,0), (1/2, 1/4), (—1/2, 1/4); (b) for example, 


10]] x» 


matrix of opposing signs show that (0, 0) is unstable since the nearby behavior 
is approximately that of a saddle point. 


/ 
near (0,0), 13 ~ i | E | (c) the two real eigenvalues of the Jacobian 


/ 
19. Near (0, 0), 12 | x es i E | the two purely imaginary eigenvalues 


of the Jacobian matrix show that (0, 0) is a stable center, which confirms what 
we expect for the pendulum. If the initial displacement and angular velocity are 
small, we expect the pendulum to oscillate indefinitely near its equilibrium. 


Section 6.4 

1. $x(1) \approx x_{10} = 0.51614$, $y(1) \approx y_{10} = 2.64423$.

3. $x(1) \approx x_{10} = 1.00781$, $y(1) \approx y_{10} = 3.04026$.
5. $x(1) \approx x_{20} = 0.66581$, $y(1) \approx y_{20} = 0.86534$.

7. $x(1) \approx x_{20} = 0.687028$, $y(1) \approx y_{20} = 0.302645$.


Section 7.2 
1. (a) $y_{10} = -0.763413361$; (b) $y_{10} = -0.738106789$; (c) $y_{10} = -0.734305821$;
the exact solution at t = 1 is $y(1) = -2e^{-1} = -0.735758882$.

3. (a) $y_{10} = 5.18748492$; (b) $y_{10} = 5.428161693$; (c) $y_{10} = 5.428161693$; the
exact solution at t = 1 is $y(1) = 2e = 5.436563657$.
5. (a) $y_{10} = 1.396393786$; (b) $y_{10} = 1.553789505$; (c) $y_{10} = 1.543274653$; the
exact solution at t = 1 is $y(1) = \tan 1 = 1.557407725$.

7. (a) $y_{10} = 0.875101928$; (b) $y_{10} = 0.877113041$; (c) $y_{10} = 0.877113041$.
9. (a) $y_{10} = 0.827421159$; (b) $y_{10} = 0.805202364$; (c) $y_{10} = 0.804960517$.
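
The sketch below compares three one-step methods on a single test problem. It assumes that exercise 3 is the IVP $y' = y$, $y(0) = 2$ with h = 0.1, which is consistent with the exact value $y(1) = 2e$ quoted there. Which method the text assigns to parts (a), (b), and (c) is not stated in this appendix, so the pairing with Euler, Heun (improved Euler), and the midpoint method is an assumption. All function names are illustrative.

```python
def step_euler(f, t, y, h):
    """One forward-Euler step."""
    return y + h * f(t, y)


def step_heun(f, t, y, h):
    """One Heun (improved Euler) step: average the slopes at both ends."""
    k1 = f(t, y)
    k2 = f(t + h, y + h * k1)
    return y + 0.5 * h * (k1 + k2)


def step_midpoint(f, t, y, h):
    """One midpoint step: use the slope at the predicted midpoint."""
    k1 = f(t, y)
    k2 = f(t + 0.5 * h, y + 0.5 * h * k1)
    return y + h * k2


def integrate(step, f, t0, y0, h, n):
    """Apply the given one-step method n times, starting from (t0, y0)."""
    t, y = t0, y0
    for i in range(n):
        y = step(f, t, y, h)
        t = t0 + (i + 1) * h  # recompute t to avoid accumulated rounding
    return y


def growth(t, y):
    """Assumed right-hand side for exercise 3: y' = y."""
    return y


for name, step in [("Euler", step_euler), ("Heun", step_heun),
                   ("midpoint", step_midpoint)]:
    print(name, integrate(step, growth, 0.0, 2.0, 0.1, 10))
```

For this linear equation the Heun and midpoint steps both multiply y by $1 + h + h^2/2$, which would explain why parts (b) and (c) of exercise 3 report the same value.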


Section 7.3 
1. (a) no= = el (b) vio = —0.735762133; the exact solution at t = 1 


is y(1)= = —0.735758882. 

3. (a) $y_{10} = 5.428161693$; (b) $y_{10} = 5.436559488$; the exact solution at t = 1 is
$y(1) = 2e = 5.436563657$.

5. (a) $y_{10} = 1.53289173$; (b) $y_{10} = 1.557406443$; the exact solution at t = 1 is
$y(1) = \tan 1 = 1.557407725$.

7. (a) $y_{10} = 0.879321827$; (b) $y_{10} = 0.881752898$.

9. (a) $y_{10} = 0.759536196$; (b) $y_{10} = 0.763163853$.
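
The fourth-order accuracy visible in part (b) of these answers can be illustrated with the classical Runge-Kutta method. The sketch below assumes, as before, that exercise 3 is the IVP $y' = y$, $y(0) = 2$ with h = 0.1, and that part (b) uses the classical fourth-order scheme. Neither assumption is confirmed by this appendix, and the name `rk4` is hypothetical.

```python
import math


def rk4(f, t0, y0, h, n):
    """Take n classical fourth-order Runge-Kutta steps for y' = f(t, y)."""
    t, y = t0, y0
    for i in range(n):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h / 2 * k1)
        k3 = f(t + h / 2, y + h / 2 * k2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6  # weighted slope average
        t = t0 + (i + 1) * h
    return y


approx = rk4(lambda t, y: y, 0.0, 2.0, 0.1, 10)
exact = 2.0 * math.e

print("RK4 y_10:", approx)
print("exact y(1):", exact)
print("error:", abs(approx - exact))
```

With only ten steps the error is already of order $10^{-6}$, compared with roughly $10^{-2}$ for the second-order value in part (a).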


Section 7.4 


—0.303502219 


(10) Fi = 
1. (a) x epee a 


= 1.45329846 
0.3011686789 
1.381773291 


ee (b) x(10) = 


1.199804688 
(10) _ 
3. (a) x ah 600390625 


ae 


1.198181011 
0.603637979 


joe 


rel (b) x70 = 


| (c) x(1) 


0.6026951788 |’ 


1.244809581 
(10) _ 
5. (a) xO = | 1.721363223 


1.74092711 | | (c) x(1) = 


| 1. ea 

1.760866373 

7a) x00 | cscnasey | <0 =| 5 aogrezosa | 
Orc [eroinepred Tora pyeieeenet 
Usa Teesgeion IRS | seorncees | 
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(10) _ [ 0-694555012]. 4. (40) _ [ 0.694412729 
lays = eeeee sb) x°"= 1 0.3051542 | 


15. (a) xi =m, x, = —16x + 2+ 1; (b) xt = 0.331345434; (c) x = 
0.240361385; (d) x1(1) = y(1) = 0.252002804. 


17. (a) xf = x, x} = —16x? + 2sin24; (b) x(1° = 0.392559752; (c) x1 = 
0.418137228. 


Section 8.2 
1 RSA, 
3cR=5. 


19,6=P 94P /10=¢' /42; 
21. t— 3/184 £°/600 — t” /35280. 
23. ¢— #7 /144 £9 /312 — t!? /13680. 


Section 8.3 
1. ay + ay 2t/./m — a /287 /(6./70) + 02/21? /(40,/7r). 
3. ay + ant — 2a, t* — 4/3aot?. 
5. (ay + a) + (—ay — 5aq)t + (ay /2 + 25a /2)t? + (—ay/6 — 125a9/6)t°. 
7. (ay + a1) + (—2a, + 3a9)t + (2a) + 9/2a9)t? + (—4a1/3 + 9a9/2)t°. 
9. ag + ayt — 5agt? — § (ay + a1)t°. 

ll. ag + ait — Sat? - Sait?. 

13. dg tat — Zagt? + sat. 

15.1-5P-—iP +40. 


132 1,4 31 46 
17.1— 304 24 — sot. 
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Section 8.4 

3. Hint: write In at = In(1 +t) —In(1 — f£) and use the fact that In(1 + ft) = 
p= P DEP FHP Asus, 

5. Using 4 = 2 and the particular solution yp = 1, y = ¢, Po(t) + Q(t) + yp = 
e(5t? — 5) + (St? — 5)3 nt — St) +1. 

7. Using A = 5 and the particular solution yp = £ = > y = ¢Ps5(t)+ 
€Q5(t) + Yp- 

9. With A= YB y =1-Jaatiet+ daatna—vyat3eten. 
11. With A = 1/3, y= t— %(A-—DAt 2) + FA — 1) — 3)(0 + 2) 
(A+4) PP +--+. 


13. Hint: use A = 4 and the particular solution yp = 2 — ~ to write the general 
solution y = c Py(t) + cQu(t) + yp; find c, and c. 


Section 8.5 


1. $H_4(t) = 16t^4 - 48t^2 + 12$ and $H_5(t) = 32t^5 - 160t^3 + 120t$.

3. Using q=5,y=a +oQt+15qt?4+:--. 

5. Using q= 3, y=2+10t+6t7+---. 

7. Using q= 2 and yp = 4t, y = 4t + 1—4(t— 9/34 81°/5!4---). 

9. Since y(0) is finite, co = 0. With q = 3, y = c,L3, and the other initial 
condition implies y = 813(t). 

lly =—BLy(t)+t- }. 

13. y= q(t) + @Yo(t). 

15. y= cy Ja(t) + c Ya(t). 

17. $y = c_1J_3(t)$, where $c_1 = -3/J_3(1)$.
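
The Hermite polynomials in exercise 1 of this section can be generated from the standard three-term recurrence $H_{k+1}(t) = 2tH_k(t) - 2kH_{k-1}(t)$ with $H_0 = 1$ and $H_1 = 2t$. The sketch below assumes this normalization, which matches the leading coefficients $16t^4$ and $32t^5$ in the answer, and stores each polynomial as a list of coefficients from the constant term upward. The function name `hermite` is illustrative.

```python
def hermite(n):
    """Return the coefficients of H_n, lowest degree first."""
    h_prev, h_curr = [1], [0, 2]  # H_0 = 1, H_1 = 2t
    if n == 0:
        return h_prev
    for k in range(1, n):
        # Multiply H_k by 2t: shift coefficients up one degree and double them.
        shifted = [0] + [2 * c for c in h_curr]
        # Pad H_{k-1} with zeros so both lists have the same length.
        padded = h_prev + [0] * (len(shifted) - len(h_prev))
        # H_{k+1} = 2t * H_k - 2k * H_{k-1}
        h_prev, h_curr = h_curr, [s - 2 * k * p for s, p in zip(shifted, padded)]
    return h_curr


print(hermite(4))  # coefficients of 12 - 48 t^2 + 16 t^4
print(hermite(5))  # coefficients of 120 t - 160 t^3 + 32 t^5
```

Reading the lists from right to left recovers the polynomials in the usual descending order.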


Section 8.6 

1. Using r = 1/2, y= t/2(1—t4 #7/64+---). 

3. Hint: multiply the DE by t on both sides and let p(t) = t — 2, q(t) = t. 
Using r=3, y= 0(1—t+17/2—17/6+---). 

5. Using r=5, y= (14+. 4t/54+5t7/12+---). 

7. Using r = 1/2, y = ¢'/7(1 — £/10 4 t7/28+---). 

9. Using r = V3, y = ¥3(1 — t/(1 +273) + 22/(4(1 + 2V3)(1 + V3) $+) 
ll. r(r—1)+pr+q=0. 
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